DESCRIPTION
LIPOPROTEINS AS NUCLEIC ACID VECTORS 

BACKGROUND OF THE INVENTION 

The present application is a continuation-in-part of co-pending U.S. Patent Application 
Serial No. 08/874,807 Entitled "Lipoproteins As Nucleic Acid Vectors" filed June 13, 1997. 
The entire text of the above-referenced disclosure is specifically incorporated by reference 
herein without disclaimer. 

1. Field of the Invention

The present invention relates to materials and methods for the in vivo transport and 
delivery of nucleic acids. More particularly, it concerns the use of lipoproteins, including but 
not limited to, low density lipoproteins ("LDL"), and/or apolipoproteins for the in vivo transport
of nucleic acids. In addition, the present invention relates to the use of lipoproteins in the early 
detection of cancer and/or metastatic cancer and/or arteriosclerosis. 

2. Description of Related Art 

The ultimate curative method for any genetic disorder, whether the disorder is inherited 
or results from a mutation, depends on an effective mode of replacing or augmenting non- 
functional gene(s). This process is now termed gene or genetic therapy. There are two 
important aspects to genetic therapy: the gene delivery system/vehicle and the gene
control/expression program. Ideally, a replacement gene should become resident in the genome
of the target cells/organism and be transferable to subsequent generations of cells and progeny,
i.e., the change is incorporated into the germ cells or reproductive cells, the sperm and ovary.
Although there have been several significant breakthroughs in this field, this area of 
biotechnology is still in its early development phase. The first step in any approach to gene 
replacement is the delivery of the specific gene (nucleic acid) to the cells.

Many techniques have been and are being developed to deliver and express genes in
cells and specific tissues in mammals in vivo. Several general, non-specific methods for
delivering genes have been reported involving aerosol nucleic acid delivery to cells (Stribling et
al., 1992); calcium phosphate precipitation, using a steep change in ionic strength (Wigler et al.,
1979); DEAE-dextran (Sompayrac et al., 1981); electroporation, forcing the nucleic acid into
the cell by using an electric field or current (Neumann et al., 1982); microinjection, physically
injecting the nucleic acid into a cell (Benvensty et al., 1986; Wolff et al., 1990); and
polycationic molecules such as polylysine polypeptides (Curiel et al., 1992) and cationic lipids
(Lee et al., 1996).

Liposomes, vesicles composed of synthetic or non-natural lipids such as long-chain fatty 
acids, can be used to carry the nucleic acid into the cell cytoplasm non-specifically (Felgner et
al., 1987). A recent invention describes the delivery of a self-initiating and self-sustaining gene
expression system which contains an RNA polymerase prebound to a DNA molecule using the 
aforementioned nucleotide delivery systems (U.S. Patent No. 5,591,601). 

Viral vectors in which specific nucleic acid sequences are incorporated into a neutralized 
or inactivated virus can use their viral entry mechanism to gain entry to the cell cytoplasm via 
specific cellular receptors to deliver nucleic acids (Schimotohono et al., 1981). The use of
specific cellular receptors is apparently a more specific method for delivering genes. In this
approach, the nucleic acid is bound either freely, through charge association, or alternatively it
is chemically and non-reversibly conjugated to proteins with specific receptor proteins on the
membrane of target cells for receptor-mediated uptake (Wu et al., 1988; Wu et al., 1989).

Techniques such as calcium phosphate precipitation, electroporation or DEAE-dextran 
transfection are not suitable for in vivo applications. Bombarding cells with nucleic acids under 
high pressure is a technique which has very limited applications in that it can only be applied 
topically and only a small number of cells can be targeted. Microinjection of nucleic acids into 
cells is mainly performed in vitro and requires actively dividing cells.

Gene delivery systems that use the viral entry mechanism of recombinant viral vectors
have major disadvantages. Systems that utilize replication-defective adenoviral vectors can
infect a wide variety of eukaryotic cell types including quiescent somatic cells utilizing the viral
entry mechanism. However, adenoviral vector-based delivery systems are only successful in
transient gene expression and repeated administration of the viral vector results in a strong
immunological response of the host. In addition, the host will experience an adenoviral
infection and can experience its symptoms if the recombinant vector undergoes homologous
recombination with the wild-type virus strain. Systems that employ recombinant retroviral
vectors can be used for stable integration of the gene of interest into the host's genome, but only
actively dividing cells can be targeted. In addition, the disadvantages of the adenoviral vector
systems also apply to retroviral vector systems (immune response, disease etc.).

Positively-charged polycationic molecules such as polylysine peptides which bind non-
specifically to the negatively charged nucleic acids have been used to introduce DNA into the
chromosome of the recipient cell or organism. Cationic lipid vesicles, liposomes and micelles
have been used in aggregates with DNA and viral envelope glycoproteins in non-specific
delivery of genes. Liposomes, vesicles composed of synthetic or non-natural lipids, such as
long-chain fatty acids, can be used to carry the nucleic acid into the cell cytoplasm non-
specifically. In these systems, the liposomes are structured to "best fit" the nucleic acid and
insertion into the cell is through non-specific uptake.

The interaction of the liposomal delivery systems discussed above with the nucleic acid 
to be delivered is non-specific. In addition, prior art techniques are designed to deliver multiple 
copies of the nucleic acid to the cell cytoplasm. Optimally, however, only one or two copies of 
a gene should be transfected per cell throughout the organism to replace a defective set of genes
only in the specific cells and tissues where it would normally be expressed. 

Thus there is a need for a safe and efficient gene delivery system that may be employed 
in the burgeoning field of gene therapy.

SUMMARY OF THE PRESENT INVENTION

The present invention contemplates a gene delivery system for use in gene therapy. 
Thus in particular embodiments, the present invention provides a composition comprising an 
isolated polypeptide comprising at least one LDL or VLDL nucleic acid binding domain; and a 
nucleic acid comprising an LDL or VLDL binding sequence, wherein the nucleic acid is bound
to the polypeptide. In particularly preferred embodiments, the polypeptide comprises an LDL
nucleic acid binding domain. In other embodiments, the polypeptide comprises a VLDL
nucleic acid binding domain. In particular aspects of the present invention, the nucleic acid
comprises an expression region operably linked to a promoter active in eukaryotic cells. In
more particular embodiments, the expression region encodes a polypeptide. In other preferred
embodiments, the expression region comprises an antisense construct. 

In those embodiments in which the expression region encodes a polypeptide, the
polypeptide may be selected from the group consisting of α-globin, β-globin, γ-globin,
granulocyte macrophage-colony stimulating factor (GM-CSF), tumor necrosis factor (TNF), IL-2,
IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, β-interferon,
γ-interferon, cytosine deaminase, adenosine deaminase, β-glucuronidase,
hypoxanthine guanine phosphoribosyl transferase, galactose-1-phosphate uridyltransferase,
glucocerebrosidase, glucose-6-phosphatase, thymidine kinase, lysosomal glucosidase, growth
hormone, nerve growth factor, insulin, adrenocorticotropic hormone, parathormone, follicle-
stimulating hormone, luteinizing hormone, epidermal growth factor, thyroid stimulating
hormone, CFTR, EGFR, VEGFR, IL-2 receptor, estrogen receptor, Bax, Bak, Bcl-Xs, Bik,
Bid, Bad, Harakiri, Ad E1B, an ICE-CED3 protease, neomycin resistance, luciferase, adenine
phosphoribosyl transferase (APRT), retinoblastoma, insulin, mast cell growth factor, p53, p16,
p21, MMAC1, p73, zac1 and BRCA1.

In those embodiments in which the expression region comprises an antisense construct,
the antisense construct may be complementary to a segment of an oncogene. In more preferred
embodiments, the oncogene may be selected from the group consisting of ras, myc, neu, raf,
erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl.

The expression region may be linked to a promoter selected from the group consisting of
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human globin γ
promoter. In a defined embodiment, the nucleic acid binding domain is an apoB100 nucleic
acid binding domain. In other embodiments, the composition of the present invention may
further comprise one or more lipoproteins selected from the group consisting of apoAI, apoA-
II, apoA-IV, acat, apoE, apoC-II, apoC-III and apo-D. In particularly preferred embodiments,
the apoB100 is selected from the group consisting of human, rat and baboon apoB100.

In particular aspects of the invention, the polypeptide comprises at least two nucleic acid
binding domains. In particularly preferred embodiments, the nucleic acid binding domain
contains a motif selected from the group consisting of a proline pipe helix DNA binding motif,
an ISGF3γ-like DNA binding motif, an SREBP-like DNA binding motif, a coiled-coil motif and a
nucleotide (ATP)-binding motif. In more defined embodiments, the binding domain may be
selected from the group consisting of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID
NO:82, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ
ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94,
SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID
NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ ID NO:105, SEQ ID
NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID
NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:144, SEQ
ID NO:145, SEQ ID NO:146, SEQ ID NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID
NO:150, SEQ ID NO:151, SEQ ID NO:152, SEQ ID NO:153, SEQ ID NO:154, SEQ ID
NO:163, SEQ ID NO:164, SEQ ID NO:165, SEQ ID NO:166 and SEQ ID NO:175.

In other embodiments, the polypeptide may further comprise at least one nuclear
localization sequence. More particularly, the nuclear localization sequence may be from
apoB100. In more preferred embodiments, the nuclear localization sequence may be selected
from the group consisting of SEQ ID NO:178, SEQ ID NO:179, SEQ ID NO:180, SEQ ID
NO:194, SEQ ID NO:195, SEQ ID NO:196, SEQ ID NO:197, SEQ ID NO:198, SEQ ID
NO:199, SEQ ID NO:200, SEQ ID NO:201, SEQ ID NO:202, SEQ ID NO:203, SEQ ID
NO:204, SEQ ID NO:205, SEQ ID NO:206, SEQ ID NO:207, SEQ ID NO:208, SEQ ID
NO:209 and SEQ ID NO:210.

Also contemplated by the present invention is a method for expressing a polypeptide in 
a human cell comprising the steps of providing a composition comprising (i) an isolated
polypeptide comprising at least one LDL or VLDL nucleic acid binding domain and (ii) a 
nucleic acid comprising an expression cassette comprising a sequence encoding the polypeptide 
and a promoter active in eukaryotic cells, wherein the coding sequence is operably linked to the 
promoter, and wherein the nucleic acid sequence is bound to the LDL or VLDL; contacting the 
composition with the cell under conditions permitting transfer of the composition into the cell;
and culturing the cell under conditions permitting the expression of the polypeptide. 



In particularly preferred embodiments, the polypeptide, independently, is a tumor
suppressor, a cytokine, an enzyme, a hormone, a receptor, or an inducer of apoptosis. In
preferred embodiments, the tumor suppressor may be selected from the group consisting of p53,
p16, p21, MMAC1, p73, zac1, BRCA1 and Rb. In preferred embodiments, the cytokine may be
selected from the group consisting of IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10,
IL-11, IL-12, IL-13, IL-14, IL-15, TNF, GM-CSF, β-interferon and γ-interferon. In other
preferred embodiments, the enzyme may be selected from the group consisting of cytosine
deaminase, adenosine deaminase, β-glucuronidase, hypoxanthine guanine phosphoribosyl
transferase, galactose-1-phosphate uridyltransferase, glucocerebrosidase, glucose-6-phosphatase,
thymidine kinase and lysosomal glucosidase. In still further preferred embodiments, the
hormone may be selected from the group consisting of growth hormone, nerve growth factor,
insulin, adrenocorticotropic hormone, parathormone, follicle-stimulating hormone, luteinizing
hormone, epidermal growth factor and thyroid stimulating hormone. In defined embodiments,
the receptor may be selected from the group consisting of CFTR, EGFR, VEGFR, IL-2 receptor
and the estrogen receptor. In other preferred embodiments, the inducer of apoptosis may be
selected from the group consisting of Bax, Bak, Bcl-Xs, Bik, Bid, Bad, Harakiri, Ad E1B and an
ICE-CED3 protease.

In particularly preferred embodiments, the nucleic acid binding domain is an apoB100
nucleic acid binding domain. In more preferred embodiments, the apoB100 may be selected
from the group consisting of human, rat and baboon low density apoB100. In still further
preferred embodiments, the binding region is selected from the group consisting of a proline
pipe helix DNA binding motif, an ISGF3γ-like DNA binding motif, an SREBP-like DNA binding
motif, a coiled-coil motif, and a nucleotide (ATP)-binding motif. In particular embodiments,
the polypeptide further may comprise at least one nuclear localization sequence. In especially
preferred embodiments, the nuclear localization sequence is derived from an apoB100 nuclear
localization sequence. In specific embodiments, the polypeptide may be selected from the
group consisting of α-globin, β-globin, γ-globin, neomycin resistance, luciferase, adenine
phosphoribosyl transferase (APRT), and mast cell growth factor.

Also provided is a method for providing an expression construct to a human cell 
comprising providing a composition comprising (i) an isolated polypeptide comprising at least 
one LDL or VLDL nucleic acid binding domain and (ii) an expression cassette comprising a
nucleic acid sequence encoding an expression region and a promoter active in eukaryotic cells, 
wherein the expression region is operably linked to the promoter, and wherein the nucleic acid 
sequence is bound to the LDL or VLDL; contacting the composition with the cell under 
conditions permitting transfer of the composition into the cell; and culturing the cell under 
conditions permitting the expression of the expression region.

In particularly preferred embodiments, the expression construct comprises an antisense 
construct. In more preferred embodiments, the antisense construct is derived from an oncogene. 
In exemplary embodiments, the oncogene may be selected from the group consisting of ras, myc,
neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl. In other embodiments, the expression
construct comprises a nucleic acid coding for a gene. In preferred aspects the gene encodes a 
polypeptide. 

In particularly preferred embodiments, the nucleic acid binding domain is an apoB100
nucleic acid binding domain. The apoB100 may be selected from the group consisting of
human, rat and baboon low density apoB100. In other preferred embodiments, the DNA
binding region is selected from the group consisting of a proline pipe helix DNA binding motif,
an ISGF3γ-like DNA binding motif, an SREBP-like DNA binding motif, a coiled-coil motif, and
a nucleotide (ATP)-binding motif.

Further, the present invention contemplates a method for treating a human disease
comprising providing a composition comprising (i) an isolated polypeptide comprising at least 
one LDL or VLDL nucleic acid binding domain and (ii) an expression cassette comprising a 
nucleic acid sequence encoding an expression region and a promoter active in eukaryotic cells, 
wherein the expression region is operably linked to the promoter, and wherein the nucleic acid 
sequence is bound to the LDL or VLDL; and administering the composition to a human subject
having the disease under conditions permitting transfer of the composition into cells of the 
human subject. 

In specific embodiments, the disease may be selected from the group consisting of 
cancer, diabetes, cystic fibrosis and arteriosclerosis. In preferred embodiments the polypeptide
comprises at least two nucleic acid binding regions. In other preferred embodiments the 
polypeptide comprises at least one nuclear localization sequence. In particularly preferred 
embodiments, the nucleic acid encodes a gene. In other preferred embodiments, the expression 
construct comprises an antisense construct. 

Another aspect of the present invention describes a pharmaceutical composition
comprising an isolated polypeptide comprising at least one LDL or VLDL nucleic acid binding
domain; and a nucleic acid comprising an LDL or VLDL binding sequence, wherein the nucleic
acid is bound to the polypeptide; the pharmaceutical composition being dispersed in a suitable
diluent.

Also contemplated by the present invention is a method of transforming a cell 
comprising providing a cell; contacting the cell with a composition comprising (i) an isolated 
polypeptide comprising at least one LDL or VLDL nucleic acid binding domain and (ii) an 
expression cassette comprising a nucleic acid sequence encoding an expression region and a
promoter active in eukaryotic cells, wherein the expression region is operably linked to the
promoter, and wherein the nucleic acid sequence is bound to the LDL or VLDL; wherein
expression of the expression region is indicative of the transformation.

Yet another aspect of the present invention contemplates a method of transfecting a cell
comprising the steps of providing a cell; contacting the cell with a composition comprising (i)
an isolated polypeptide comprising at least one LDL or VLDL nucleic acid binding domain and 
(ii) an expression cassette comprising a nucleic acid sequence encoding an expression region 
and a promoter active in eukaryotic cells, wherein the expression region is operably linked to 
the promoter, and wherein the nucleic acid sequence is bound to the LDL or VLDL; wherein 
expression of the expression region is indicative of the transfection.

Other objects, features and advantages of the present invention will become apparent 
from the following detailed description. It should be understood, however, that the detailed 
description and the specific examples, while indicating preferred embodiments of the invention, 
are given by way of illustration only, since various changes and modifications within the spirit
and scope of the invention will become apparent to those skilled in the art from this detailed 
description. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The following drawings form part of the present specification and are included to further 
demonstrate certain aspects of the present invention. The invention may be better understood
by reference to one or more of these drawings in combination with the detailed description of 
specific embodiments presented herein. 

FIG. 1A-FIG. 1C show the amino acid sequence of apoB-100.

FIG. 2A-FIG. 2F show a homology alignment of SH3-like regions in apo B-100 with
known SH3 domains of signal transduction proteins. FIG. 2A-FIG. 2D are the homology
alignments and FIG. 2E and FIG. 2F identify the regions of apo B-100 and the proteins aligned.

FIG. 3A-FIG. 3D show a comparison of SH2-like regions in apo B-100 to known SH3
domains of signal transduction proteins. FIG. 3A-FIG. 3C are the homology alignments;
FIG. 3D identifies the proteins and regions aligned.

FIG. 4A-FIG. 4C show a comparison of the apo B-100 SH1-like region to SH1 kinase
domains of known signal transduction proteins. FIG. 4A and FIG. 4B show the alignments;
FIG. 4C identifies the proteins and regions aligned.

FIG. 5A and FIG. 5B show the inter-kringle proline-rich regions of Apo[a] compared
with the proline rich region of SH3-binding protein (3BP1). FIG. 5A shows the alignment;
FIG. 5B identifies the proteins and regions aligned.

FIG. 6A and FIG. 6B show a homology alignment of specific regions of apo B-100 and
the activation regions located at the amino- and carboxyl-termini of signal transduction
proteins.

FIG. 7 illustrates the homology of specific regions of apo B-100 with proline pipe helix
motifs of Tus.

FIG. 8A-FIG. 8D show a homology alignment among one region of the DNA-binding
protein ISGF3γ and similar regions in apo B-100.

FIG. 9A-FIG. 9D show a homology alignment among regions of the DNA-binding
protein ISGF3γ and similar regions in apo B-100.

FIG. 10A-FIG. 10N. FIG. 10A shows a sequence comparison of the DNA-binding
domains of the SREBP1, SREBP2, and ADD1 proteins with similar regions found in apo
B-100. FIG. 10B-FIG. 10N show a sequence comparison of the DNA-binding domains of
SREBP1 with various apolipoproteins.

FIG. 11 shows a comparison of the primary structures of known coiled-coil regions of
DNA-binding proteins and analogous regions in apo B-100.

FIG. 12A-FIG. 12C show a comparison of known ATP-binding loop motifs to similar
regions in apo B-100.

FIG. 13A-FIG. 13E show a comparison of known nuclear localization signal sequences
to similar regions in apo B-100.

FIG. 14A-FIG. 14J show a comparison of human apo B-100 regions with sequenced
regions of apo B-100 from other species.

FIG. 15 shows the composition of the LDL gene delivery system of the instant invention.
LDL containing apo B-100 is depicted along with a DNA sequence containing a promoter, a
protein coding region, a 3' untranslated region, and a non-coding region.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

The present invention arises from the discovery that regions of apolipoproteins, the 
protein fraction of lipoprotein particles, are similar in primary structure and amino acid 
sequence to cellular proteins which are known to bind to DNA. Presently, the only known 
functions of lipoproteins VLDL, IDL, LDL and HDL are the solubilization and transport of
hydrophobic lipids in plasma. The instant invention shows that LDLs, but not other 
lipoproteins, form a complex with DNA. 

Herein, synthetic analogues of regions of DNA have been shown to bind to highly
purified preparations of human, rat, and baboon LDL but not to other human lipoproteins such
as VLDL and HDL, nor to mouse lipoproteins. In fact, the differences observed among the four
species tested suggest that human, rat, and baboon lipoproteins behave very similarly in terms
of DNA binding preference. Further, purified preparations of human, rat, and baboon LDLs are
shown to complex with the promoter region of the human cytomegalovirus. Thus, the present
invention demonstrates that human LDL complexes with specific regions of genomic DNA.

Because lipoproteins have specific cell membrane receptors and are actively and
specifically internalized by many different cell types in mammals, and because the inventors
show that LDL can bind DNA, these lipoproteins can be used as gene delivery vectors. More
specifically, this invention relates to materials and methods for the use of lipoproteins, such as
LDL, or, for example, apolipoproteins such as, but not limited to, apoB-100, apoAI, apoE,
apoAIV, and apoC, or more specifically still, the DNA binding regions of these lipoproteins, as
gene delivery vectors in vivo. As explained in greater detail below, the various embodiments of
this invention include, but are not limited to, the delivery of nucleic acids to a cell in the form of
an LDL-lipoprotein complex, the specific delivery of DNA to the nucleus, and the specific
localization of delivered DNA to specific nuclear sites.

Plasma levels of DNA increase in a variety of chronic diseases including lupus
erythematosus (Steinman, 1984), viral hepatitis (Neurath et al., 1984), and a variety of cancers
(Leon et al., 1977; Shapiro et al., 1983; Stroun et al., 1987; Nawroz et al., 1996; Anker et al.,
1997; Chen et al., 1996). It further has been shown that lipoproteins in the blood of non-tumor
carrying organisms are not bound to nucleic acids. However, cancer-carrying individuals, and
in particular individuals with metastatic cancers, release large amounts of nucleic acids into
the blood. Thus, this invention also relates to the observation that lipoproteins in the blood of
cancer patients and especially metastatic cancer patients are bound to nucleic acids, including
DNA. Accordingly, this invention also may be used to provide a simple screening test for the
presence or absence of cancer, especially metastatic cancer, by isolating a patient's lipoproteins
and determining whether the lipoproteins are bound to nucleic acids; the presence of
lipoprotein-bound nucleic acid being correlative with the presence of cancer and/or metastatic
cancer in the living body. Further embodiments of the present invention relate to the sequence
specific detection of DNA bound to lipoproteins in a cancer patient as a method for the
identification of specific types of cancer in a living body. These and other aspects of the
present invention are discussed in greater detail below.

1. LIPOPROTEINS

Lipoproteins appear as micro-pseudomicellar particles in the blood plasma of all
mammalian species including humans. Their major function is to transport lipids and other
hydrophobic compounds (i.e., fat-soluble vitamins) through the aqueous environment of the
blood stream to their specific target cells. The transported lipids can be used as a major
substrate for energy metabolism (i.e., triglycerides), structural components for cell membranes
(i.e., phospholipids and cholesterol), or as precursors for steroid hormones and bile acids (i.e.,
cholesterol). Although lipoproteins vary widely in size and lipid content, they have a common
general structure. Lipoprotein particles are believed to be spherical and consist of a
hydrophobic core containing nonpolar lipids surrounded by a hydrophilic surface monolayer of
polar lipids and proteins, which are called apolipoproteins.

Plasma lipoproteins may be separated into five major classes based on their density, 
size, and compositional and functional properties: 1) chylomicrons, 2) very low density 
lipoproteins (VLDL), 3) intermediate density lipoproteins (IDL), 4) low density lipoproteins (LDL),
and 5) high density lipoproteins (HDL). The different classes of lipoproteins show distinct
compositional differences in apolipoprotein content. The specific role of each class of
lipoproteins in lipid metabolism is determined by the interaction of these apolipoproteins with
specific enzymes and cellular receptors.

a. ApoB100 Structure and Function

The major protein constituent of LDL is apoB-100. ApoB-100 is one of two known 
natural ligands for the LDL (apoE/apoB) receptor which is found on the surface of a wide 
variety of mammalian cell types (Brown and Goldstein, 1986). LDLs are taken up by a process
called receptor-mediated endocytosis (Brown and Goldstein, 1986). Hence, lipoproteins may
be able to function as naturally-occurring liposomes which contain protein constituents that can
bind specifically to nucleic acids and can be internalized by a wide variety of eukaryotic cell
types via specific receptor mediated processes. 

Human apolipoprotein B-100 (apoB-100) is a major apoprotein component of very-low
density lipoproteins (VLDL), intermediate density lipoproteins (IDL), low density lipoproteins
(LDL), and lipoprotein [a] (Lp[a]). ApoB-100 is synthesized and incorporated into VLDL and
Lp[a] by the liver. Human LDL can be described as a spherical particle composed of a
hydrophobic core of cholesterol esters and triglycerides encapsulated by an amphipathic
monolayer of phospholipids, glycolipids and cholesterol in which the apoB-100 is partially
imbedded (Myant, 1990). In addition to one molecule of apoB-100, LDL is known to contain
varying numbers of apo C-I, apo C-II, apo C-III, apo E, and apo D (Blanco-Vaca et al., 1992;
Connelly et al., 1993; Blanco-Vaca et al., 1994).

The primary structure of apoB-100, SEQ ID NO:1 (FIG. 1A-FIG. 1C), has been
determined by amino acid sequence analysis (Yang et al., 1986; Yang et al., 1989) and inferred
from its cDNA sequence (Yang et al., 1986; Yang et al., 1989; Knott et al., 1986). There
appear to be several different isoforms of apo B-100. The isoform shown in FIG. 1A-FIG. 1C
is the isoform used for all of the alignments in the specification. Homologous regions in the
other isoforms, however, would align similarly.

The apparent molecular weight of apoB-100 is 512 kDa based on its amino acid
composition of 4536 residues. The apoprotein contains 25 Cys residues (Coleman et al., 1990;
Yang, 1990), at least 16 of which form intramolecular disulfide bonds, with the remaining
cysteines present as free sulfhydryls, as additional (unassigned) intramolecular disulfides, or as
intermolecular disulfide linkages to other apolipoproteins (Blanco-Vaca et al., 1992; Connelly
et al., 1993). Several important functional regions on apoB-100 that have been identified
include heparin-binding sites (Cardin et al., 1987; Weisgraber and Rall, 1987), glycosylation
sites (Knott et al., 1986; Innerarity et al., 1986), and the LDL receptor-binding region (Blanco-
Vaca et al., 1992; Knott et al., 1986; Milne et al., 1989).
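
As a rough plausibility check on the molecular weight and residue count stated above, the mass implied per residue can be computed directly. The following is an illustrative sketch only and is not part of the disclosed methods. The 110 Da average residue mass is a common rule of thumb and is an assumption not taken from this specification, and the function names are hypothetical.

```python
# Illustrative arithmetic only; function names are hypothetical.
RESIDUES = 4536            # residue count stated in the text
APPARENT_MASS_KDA = 512.0  # apparent molecular weight stated in the text
AVG_RESIDUE_DA = 110.0     # ASSUMPTION: rule-of-thumb average residue mass


def implied_residue_mass_da(mass_kda, residues):
    """Average mass per residue (Da) implied by a total mass in kDa."""
    return mass_kda * 1000.0 / residues


def rule_of_thumb_mass_kda(residues, avg_da=AVG_RESIDUE_DA):
    """Estimated protein mass (kDa) from residue count and an assumed average."""
    return residues * avg_da / 1000.0


if __name__ == "__main__":
    print(round(implied_residue_mass_da(APPARENT_MASS_KDA, RESIDUES), 1))
    print(round(rule_of_thumb_mass_kda(RESIDUES), 1))
```

Under these assumptions the stated figures imply roughly 113 Da per residue, which is close to the rule-of-thumb average, so the two stated values are mutually consistent.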

ApoB-100 and apolipoprotein E (apoE), apolipoproteins present in the low-density
lipoprotein group, function as ligands for the high-affinity receptor-mediated removal of certain
lipoproteins from plasma by the liver and delivery of cholesterol and cholesterol esters to a
variety of target tissues (Myant, 1990; Innerarity et al., 1986; Brown and Goldstein, 1986;
Mahley, 1988). A general mechanism for the receptor mediated uptake of LDL is well-
established (Myant, 1990; Innerarity et al., 1986; Brown and Goldstein, 1986; Mahley, 1988),
and the role of the apoB-100 molecule in this mechanism also is well defined.

Specific binding of low density lipoproteins to their mammalian cell receptors depends
on the presence and conformation of the apoB-100 ligands (Kinoshita et al., 1990). Several
reports have shown that the interaction of apoB-100-lipoproteins with the up-regulated, high
affinity LDL (apoB/apoE) receptor is modulated by the lipid composition of the particle (Teng
et al., 1985; Marcel et al., 1988), by other apoproteins such as apo[a] in Lp[a] (Kostner and
Grillhofer, 1991; Young et al., 1986) and apoE in β-VLDL (Innerarity et al., 1986; Mahley,
1988), and by monoclonal antibodies to specific regions of the apoB-100 molecule (Innerarity
et al., 1986; Young et al., 1986).



In searching the apoB-100 sequence for regions of sequence similarity to other proteins,
nucleic acid binding regions (for deoxyribonucleic acid, DNA, and ribonucleic acid, RNA),
nucleotide-binding regions, and nuclear-localization regions have been identified in the amino
acid sequences of apoB-100 and apoE. The present invention demonstrates that highly
purified preparations of human, rat, and baboon LDL bind specifically to pure preparations of
human genomic DNA. These properties impart to the lipoproteins the capacity to serve as
delivery vehicles for genetic material.


Lipoprotein particles carry a variety of vitamins and steroid compounds in their pseudo-micelle
lipid core which may function in the control of gene expression. These attributes impart
to the lipoproteins a virus-like character as well as capacity. While the inventors do not wish to
be bound by any particular theory, the many control elements and signal motifs in the primary
structure of the apolipoproteins are suggestive of the ability of these proteins to transport
nucleic acids, enter the cell, participate in signal transduction, enter the nuclear space, initiate
incorporation of nucleic acid materials into the resident genome, and cause its subsequent
expression. As used herein, the term "primary structure" refers to the amino acid sequence of
the protein. The capacity of purified LDL to bind to human genomic DNA, along with
apoB-100's homology to SH1, SH2, and SH3 signal transducer domains, supports this hypothesis.
These properties of apoB100, and methods of exploiting these properties, are discussed in
further detail below.

2. NUCLEIC ACID BINDING REGIONS 

The inventors have found that apo B-100 is also involved in DNA binding. DNA is the
genetic blueprint that contains the information necessary for cell growth, differentiation,
proliferation, and cellular response to environmental factors. The phenotypic differences
between various cell types in higher eukaryotes are mainly due to differences in cellular gene
expression.


The regulation of gene expression is predominantly controlled at the stage of initiation
of transcription and is mediated by proteins which recognize specific DNA sequences. In order
to recognize and bind to a specific DNA sequence, a protein utilizes a structural motif. Over the
past 15 years, several structural DNA binding motifs have been identified, including zinc
fingers, helix-turn-helix, basic helix-loop-helix, KH RNA-binding motifs, leucine zippers,
and proline pipe helices. The inventors report here the identification of regions in apo B-100
with homology to various DNA binding motifs including: 1) proline pipe helix DNA binding
motifs, 2) ISGF3γ-like DNA binding motifs, 3) SREBP-like DNA binding motifs, 4) coiled-coil
motifs, and 5) nucleotide (ATP)-binding motifs.


a. Nucleotide and ATP Binding Motifs 

The inventors discovered that there is a certain degree of homology between regions
of apo B-100 and known ATP binding motifs found in other proteins, including those involved
in signal transduction and transcriptional-ribonucleotide synthesis (t-RNA synthetases).
Typically, these proteins contain sites which interact with different regions of the nucleotide,
i.e., the negatively charged phosphate regions, the ribose (carbohydrate) hydroxyl groups, and the
base. A second site binds to the substrate ligand, such as any amino acid in the case of t-RNA
synthetases and tyrosine, serine and threonine residues in the phosphorylation of proteins.

Examination of the apoB-100 primary structure reveals several regions which are similar
in sequence to the known nucleotide and ATP binding motifs and are suggestive of a similar
function. For example, ATP-binding sites are known to contain an essential ATP-binding
lysine residue. In lyn, the site is T269KVAVTLKPG (SEQ ID NO:54) and in lyk, it is
D386KVAIKTIREG (SEQ ID NO:55). A similar region can be found in apoB-100,
DLNAVANKIAD (SEQ ID NO:56). The similarity of this region in apo B-100 with the
ATP-binding sites on known tyrosine-kinases suggests that apo B-100 can bind to the nucleic acid,
ATP.

A single ATP-binding region occurs between residues 3800 and 3840, which is located
in the kinase domain of apoB-100. The sequence of this region, aligned with known ATP-binding
regions of kinases, is shown in FIG. 12A-FIG. 12C. FIG. 12A-FIG. 12C show a comparison of
known ATP-binding loop motifs to similar regions in apo B-100. Bold letters indicate
conserved amino acids, critical amino acids (H and K) are indicated by "#", "*" indicates
conserved amino acids, "-" indicates gaps introduced in the sequence in order to align the
proteins, and identical amino acids between the sequences in "C" are listed below the
alignment. Sequence identification numbers are listed in the right margin. The critical lysine
residue is retained and the degree of similarity suggests a like function.

The ATP-binding motifs typical of t-RNA synthetases are characterized by the signature
sequence HIGH (histidine, isoleucine, glycine, histidine; SEQ ID NO:177), and a second motif
which contains a critical lysine residue. These motifs are located within 300 residues and occur
as proximal loops on the surface of the protein molecule. Several analogues of this signature
sequence occur in the apoB-100 sequence (see FIG. 7 and FIG. 12A-FIG. 12C). An extended
comparison of apoB-100 regions which contain the HIGH signature sequence is made with the
tyrosyl-tRNA synthetase sequence shown in FIG. 12A-FIG. 12C.


b. Proline Pipe Helix Structures 

The proline pipe helix is usually present in proteins that contain a proline at every fifth
position (Myant, 1990) over a stretch of amino acid sequence at least 20 residues long,
(PXXXXP)n (SEQ ID NO:75), where n>4. In the proline pipe helix, 5.56 residues are required to
make one complete left-handed helical turn. The proline pipe helix is stabilized by a hydrogen
bonding network between the C=O groups of residues in positions i+1, i+2, i+3 (where i is a
proline or sometimes non-proline residue) with the NH groups in positions i+2, i+3, i+4,
respectively, of the following turn (Blanco-Vaca et al., 1992). The unusually large turn of the
helix results in the formation of a channel running along the helix that is about 6 Å in average
diameter (Myant, 1990) and large enough to accommodate water (Blanco-Vaca et al., 1992) and
possibly other molecules.

One function of the proline pipe helix is DNA binding. For example, the proline pipe
helix in Tus is involved in tight binding to highly specific 22-23 base pair DNA sequences known
as Ter sites (Connelly et al., 1993; Blanco-Vaca et al., 1994). Because of its large diameter
compared to the α-helix, the proline pipe helix spans the entire width of the major groove
(Blanco-Vaca et al., 1992) and results in a tight and highly specific fit. This tight fit also
results in a high correspondence between the positively charged amino acid residues of the
proline pipe helix and the negatively charged phosphate groups of DNA (Blanco-Vaca et al., 1992).
The occurrence of the proline pipe-DNA interactions in nature might be more widespread than
presently thought and this interaction might play a very important biological function.

Careful examination and analysis of the apoB-100 amino acid sequence shows that the
40-residue proline-rich segment P2682-I2719, or a portion of this segment, assumes a proline
pipe helical conformation (see FIG. 7),
PDFRLPEIAIPEFIIPTLNLNDFQVPDLHIPEFQLPHISH (SEQ ID NO:76). Because the unique features
of the proline pipe helix make it suitable for tight and highly specific DNA binding, this
segment or motif in apoB-100 constitutes one of the DNA binding sites.
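
The five-residue proline periodicity described above can be checked directly against the SEQ ID NO:76 segment. The following is a minimal sketch only. The function names are hypothetical, and the scan simply counts how many of the positions 0, 5, 10, ... of the segment hold a proline. It does not predict conformation.

```python
# Segment quoted in the text as SEQ ID NO:76 (40 residues).
SEQ_ID_NO_76 = "PDFRLPEIAIPEFIIPTLNLNDFQVPDLHIPEFQLPHISH"


def proline_positions(seq: str) -> list[int]:
    """Zero-based indices of every proline in the sequence."""
    return [i for i, aa in enumerate(seq) if aa == "P"]


def periodic_proline_count(seq: str, period: int = 5) -> tuple[int, int]:
    """Count prolines at positions 0, period, 2*period, ...

    Returns (slots holding a proline, total slots examined).
    """
    slots = range(0, len(seq), period)
    hits = sum(1 for i in slots if seq[i] == "P")
    return hits, len(slots)


if __name__ == "__main__":
    print(proline_positions(SEQ_ID_NO_76))
    hits, total = periodic_proline_count(SEQ_ID_NO_76)
    print(f"{hits} of {total} periodic positions are proline")
```

In this segment the one periodic position without a proline falls at index 20, which is consistent with the statement that position i is sometimes a non-proline residue.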

The functional implications of DNA binding by apoB-100 include, but are not limited
to: (1) binding of DNA such as, for example, microsatellite DNA (Connelly et al., 1993;
Blanco-Vaca et al., 1994) to apoB-100 or its fragment(s) for DNA transport from the cytoplasm
to the nucleus; (2) binding of apoB-100 or its fragment(s) to the nuclear DNA to regulate
transcription or effect other functions; or (3) binding of DNA to apoB-100 or its fragment(s) to
transport DNA from the nucleus to the cytoplasm. Other functions as a consequence of
apoB-100 DNA binding through the apoB-100 proline pipe helix are not precluded. Therefore, the
proline pipe region of apoB-100 constitutes an important target for structure-based drug design
and delivery systems.

c. ISGF3γ-Like DNA Binding Motifs

ISGF3 is a multimeric transcription factor involved in the regulation of transcription of a
large set of genes. This factor dissociates into two protein components termed ISGF3γ and
ISGF3α. ISGF3γ is a 48 kDa protein that binds DNA, recognizing the IFN-stimulated response
element. ISGF3α does not bind DNA. Regions in apoB-100 have been found to be
homologous to the DNA-binding domain of ISGF3γ (FIG. 8A-FIG. 8D and FIG. 9A-FIG. 9D).


FIG. 8A-FIG. 8D show a homology alignment among one region of the DNA-binding
protein ISGF3γ and similar regions in apo B-100. Basic amino acids are indicated in bold, "*"
indicates conserved amino acids between the two regions, and "V" indicates conserved amino
acids that have switched positions between the two sequences aligned. Sequence identification
numbers are identified in the legend to the figure.

FIG. 9A-FIG. 9D show a homology alignment among regions of the DNA-binding
protein ISGF3γ and similar regions in apo B-100. Basic amino acids are indicated in bold, and
"-" indicates gaps introduced in the sequence in order to align the two proteins. Sequence
identification numbers are identified in the right margin.

This indicates that apoB-100 can bind specific DNA sequences in a manner similar to
ISGF3γ.

d. SREBP-Like DNA Binding Motifs

Another region within apoB-100 shows striking resemblance to the DNA binding
domains of previously identified sterol regulatory element binding proteins (SREBPs; FIG.
10A and FIG. 10B). A sequence comparison of the DNA-binding domains of the SREBP1,
SREBP2, and ADD1 proteins with similar regions found in apo B-100 is shown in FIG. 10A,
where basic amino acids are indicated in bold, "*" indicates conserved amino acids, "-"
indicates gaps introduced in the sequence in order to align the two proteins, and identical amino
acids between the two sequences are listed below the alignment. FIG. 10B shows a sequence
comparison of the DNA-binding domains of SREBP1 with various apolipoproteins, where basic
amino acids are indicated in bold, "*" indicates conserved amino acids, "-" indicates gaps
introduced in the sequence in order to align the two proteins, "V" indicates conserved amino acids
that have switched positions between the two sequences aligned, and identical amino acids
between the two sequences are listed below the alignment. Sequence identification numbers are
indicated in the legend to the figure. The full line of "*************" separates the different
sequence alignments.

SREBP's are members of the basic helix-loop-helix-leucine zipper (bH-L-H-Zip) family
of transcription factors and play a major role in the transcriptional regulation of a number of
genes involved in cholesterol homeostasis as well as lipid biosynthesis. SREBP's contain 3
segments: 1) an NH2 terminal bH-L-H-Zip DNA binding domain including an acidic
transcription activating domain; 2) a middle segment containing two membrane spanning
domains; and 3) a COOH terminal segment. In order for SREBP's to become functionally
active transcription factors, their NH2 terminal domain containing the bH-L-H-Zip region needs
to be released from the endoplasmic reticulum or nuclear envelope. This process is mediated by
a sterol-regulated protease. Thus, apo B-100, like the SREBP's, binds DNA.

e. Coiled-coil Motif (Leucine Zipper)

The coiled-coil motif (Myant, 1990), sometimes referred to as the leucine zipper
(Blanco-Vaca et al., 1992), is characterized by two α-helical chains that wrap around each other
to form a left-handed supercoil. The amino acid sequence of coiled-coil forming proteins is
characterized by the presence of heptad repeats, that is, three or more repeats of a seven-residue
sequence where every third and every fourth position in the heptad is occupied by a
hydrophobic residue (Blanco-Vaca et al., 1992; Connelly et al., 1993; Blanco-Vaca et al.,
1994). The two α-helical chains that form the coiled-coil can align either in parallel or
anti-parallel orientation and their stabilities are dependent on the presence of strategically located
hydrophobic and electrostatic interactions (Yang et al., 1986; Yang et al., 1989; Knott et al.,
1986; Coleman et al., 1990; Yang, 1990; Cardin et al., 1987; Weisgraber and Rall, 1987;
Innerarity et al., 1986; Milne et al., 1989; Brown and Goldstein, 1986). The most attractive
feature of the coiled-coil is that highly specific interactions can be tailored by redesigning this
relatively simple motif.
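
The heptad repeat lends itself to a simple periodicity scan. The sketch below is illustrative only and rests on several assumptions not stated in the text: it uses the conventional a-g labeling of heptad positions with hydrophobic residues expected at a and d (a 3, 4 spacing), it treats A, I, L, M, F, V, W and Y as hydrophobic, and the test sequence is a made-up idealized repeat, not a sequence from apoB-100.

```python
# Assumed hydrophobic set; the text does not define one.
HYDROPHOBIC = set("AILMFVWY")


def heptad_score(seq: str, frame: int) -> tuple[int, int]:
    """Count hydrophobic residues at heptad positions a and d.

    `frame` is the index of the first residue treated as position a.
    Returns (hydrophobic a/d positions, total a/d positions).
    """
    hits = 0
    total = 0
    for i, aa in enumerate(seq):
        if (i - frame) % 7 in (0, 3):  # positions a and d
            total += 1
            if aa in HYDROPHOBIC:
                hits += 1
    return hits, total


def best_frame(seq: str) -> int:
    """Return the frame (0-6) with the highest fraction of hydrophobic a/d positions."""
    def fraction(frame: int) -> float:
        hits, total = heptad_score(seq, frame)
        return hits / total if total else 0.0
    return max(range(7), key=fraction)


if __name__ == "__main__":
    toy = "LKELEDK" * 4  # hypothetical idealized heptad, four repeats
    frame = best_frame(toy)
    print(frame, heptad_score(toy, frame))
```

A real analysis would use a profile-based predictor rather than this binary count, but the scan shows why three or more in-register heptads stand out from unrelated sequence.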

The coiled-coil motif occurs widely in native proteins (Lupas et al., 1991; Cohen and
Parry, 1986). It plays structural and functional roles in fibrous proteins such as keratin, myosin,
elastin, fibrinogen, tropomyosin, etc. The coiled-coil motif also serves as the dimerization
domain for a number of transcription factors such as GCN4 (O'Shea et al., 1991; Ellenberger et
al., 1992), GAL4 (Kraulis et al., 1992; Baleja and Sykes, 1991; Marmorstein et al., 1992), and
c-Fos-c-Jun (Glover and Harrison, 1995), where only the dimeric form binds to DNA and is
active. It is found in globular proteins, such as tRNA synthetase (Cusack et al., 1990; Biou et
al., 1994), and serves as anchors into the tRNA. Naturally occurring coiled-coils can also be
found as three-stranded (Bullough et al., 1994a; Bullough et al., 1994b) or four-stranded
(Banner et al., 1987) structures.

Sequence alignment analysis of apoB-100 predicts that there are at least eight coiled-coil
structures of varying lengths in different regions of its amino acid sequence (FIG. 11). FIG. 11
shows a comparison of the primary structures of known coiled-coil regions of DNA-binding
proteins and analogous regions in apo B-100. Bold letters indicate conserved amino acids.
Sequence identification numbers are listed in the right margin.


While the inventors do not wish to be bound by any particular theory, it is likely that
these coiled-coil domains play very important structural and functional roles that, in turn, are
vital to the function of LDL. For example, the coiled-coil motif can serve as dimerization or
multimerization sites that may be important in LDL solubilization or aggregation. The
coiled-coil motif can also bind DNA, RNA or nucleotides and, therefore, plays a very important
role in the regulation and energetics of protein synthesis. The coiled-coil motif can also serve as a
template for transport of molecules within and between the cytoplasm and the nucleus. In
addition, the coiled-coil motif can also serve as a (temporary) reservoir of ligands that may be
important in the regulation of the metabolic pathways. This list is by no means exhaustive, but
demonstrates the biological importance of the coiled-coil motif in apoB-100.

Given the discovery of the coiled-coil motif in apoB-100 and the important biological
implications of its presence, apoB-100, by itself or as part of LDL, constitutes an important
target for structure-based drug design, delivery, and diagnostic systems. Coiled-coil forming
sequences in apoB-100 (as indicated in FIG. 11) can be used to design, study and manufacture
coiled-coil based peptide or protein delivery systems for drugs, radioisotopes, oligonucleotides,
genes, antigens, antibodies, epitopes for vaccines, sugars, carbohydrate analogs and other
ligands to specific targets in cells, tissues and organs. Either single strands or multiple strands of
the apoB-100 coiled-coil forming peptide sequences can be used as components of, or
attached to, the aforementioned ligands by either covalent or non-covalent methods.


Coiled-coil forming sequences in apoB-100 (FIG. 11), or fragments, analogs, or
modifications thereof, can be used as site-specific targets for the delivery of drugs,
radioisotopes, oligonucleotides, genes, antigens, antibodies, epitopes for vaccines, sugars,
carbohydrate analogs and other ligands. Site-specific targeting includes the use of coiled-coils,
coiled-coil forming peptides, or any functional group that binds to the aforementioned
coiled-coil sequences in apoB-100.

3. NUCLEAR LOCALIZATION SIGNALS 

In addition to homology with DNA binding proteins, apoB-100 contains several regions
that are homologous to known nuclear localization signals (FIG. 13A-FIG. 13E). These signals
include the NLS from human p53, Abl, and apoJ. FIG. 13A-FIG. 13E show a comparison of
known nuclear localization signal sequences to similar regions in apo B-100.

The bipartite nuclear localization signal contains two essential elements comprised of
basic amino acids, H (histidine), R (arginine), and K (lysine), which are required for nuclear
targeting. The signal motif starts with two basic amino acids, which are then followed by a ten
to thirty amino acid spacer and a basic cluster of five amino acids, three of which must be basic.
Approximately 50% of the known nuclear proteins listed in the protein databases have this
motif, while less than 5% of non-nuclear proteins have it. FIG. 13A and FIG. 13B show
sequences in apoB-100 with the perfect 10 amino acid spacer between the bipartite nuclear
localization sequence elements.

There is no strict requirement for the spacer length other than perhaps flexibility in the
amino acids, i.e., the dihedral angles. Indeed, there are basic amino acid clusters in the
apo B-100 molecule that are separated by longer spacers and are nevertheless potential
DNA-binding regions. FIG. 13C shows sequences in apoB-100 with more or less than 10 amino
acids in the spacer region between the bipartite nuclear localization sequence elements, and
FIG. 13D-FIG. 13E show sequences in apoB-100 with more or less than 10 amino acids in the
spacer region between imperfect "bipartite" nuclear localization sequence elements.
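
The bipartite rule stated above (two basic residues, a spacer, then a five-residue cluster containing at least three basic residues) can be expressed as a short scan. This is a sketch under stated assumptions, not the procedure used to produce FIG. 13: the function name, the adjustable spacer bounds, and the test sequence are all hypothetical. The defaults follow the ten to thirty residue spacer given in the text, and the bounds can be widened to reflect the looser spacer requirement also described.

```python
BASIC = set("HRK")  # basic residues as listed in the text


def find_bipartite_nls(seq: str, min_spacer: int = 10, max_spacer: int = 30):
    """Return (start index, spacer length, cluster) for each candidate motif.

    A candidate is two consecutive basic residues, then `spacer` residues,
    then a window of five residues of which at least three are basic.
    """
    hits = []
    for i in range(len(seq) - 1):
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            for spacer in range(min_spacer, max_spacer + 1):
                start = i + 2 + spacer
                cluster = seq[start:start + 5]
                if len(cluster) < 5:
                    break  # ran off the end of the sequence
                if sum(aa in BASIC for aa in cluster) >= 3:
                    hits.append((i, spacer, cluster))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: KR, a 10-residue alanine spacer, then KRKAA.
    toy = "GGKR" + "A" * 10 + "KRKAAGG"
    print(find_bipartite_nls(toy))
```

Overlapping candidates are reported separately, so a real sequence with an extended basic stretch will yield several adjacent hits that would normally be merged by hand.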

Thus, these regions in apoB-100 are NLS sequences capable of directing DNA to the
nucleus of a cell. Apolipoproteins present on human LDL can bind to DNA through the DNA
binding motifs identified herein. The functional bH-L-H-Zip domain within apoB-100 can
enter the nucleus, following proteolytic release and/or aided by the nuclear localization signal
domains present within the apolipoproteins, and regulate transcription of the target genes.


In addition, apo B-100 appears to be conserved across species. FIG. 14A-FIG. 14J show
various regions of human apo B-100 aligned with the sequenced fragments of the apo B-100
from pig, rat, hamster, mouse, chicken and rabbit. Bold and underlined letters indicate
positively charged, basic amino acids, and "-" indicates gaps introduced in the sequence in order
to align the proteins.

4. HOMOLOGY TO SIGNAL TRANSDUCING PROTEINS 

The inventors have found that, in addition to homology with nuclear localization signals
and DNA binding proteins, the apoB-100 molecule has regions of sequence similarity to known
motifs in a variety of signal transduction molecules. For example, regions of apo B-100 are
homologous to src homology 3 (SH3) (FIG. 2A-FIG. 2F), src homology 2 (SH2) (FIG. 3A-
FIG. 3D) and src homology 1 (SH1) (FIG. 4A-FIG. 4C) kinase domains that are common to
protein tyrosine kinases of the signal transduction system (Koch et al., 1991; Pawson, 1992;
Schlessinger, 1994; Margolis, 1992; Waksman et al., 1993; Carpenter, 1992; Ugi et al., 1994;
Lowenstein et al., 1992; Guevara, Jr. et al., 1994), as well as activation regions located at the
amino- and carboxyl-termini of signal transduction proteins (FIG. 6A and FIG. 6B).

FIG. 2A-FIG. 2F is a homology alignment of SH3-like regions in apo B-100 with
known SH3 domains of signal transduction proteins, where "*" indicates conserved amino
acids, "-" indicates gaps introduced in the sequence in order to align the two proteins, identical
amino acids between the two sequences are listed below the alignment, and percent similarity is
indicated in the right margin. This alignment is followed by a table identifying the regions of
apoB-100 and the various proteins aligned to these regions along with their respective sequence
identification numbers.

FIG. 3A-FIG. 3D show a comparison of SH2-like regions in apo B-100 to known SH2
domains of signal transduction proteins, where structurally important motifs are indicated by
double underline, basic amino acids are indicated in bold, "*" indicates conserved amino acids,
"-" indicates gaps introduced in the sequence in order to align the two proteins, identical amino
acids between the two sequences are listed below the alignment, and percent similarity is
indicated in the right margin. The alignment is followed by a table identifying the reference
proteins and regions of apoB-100 in the alignment along with their sequence identification
numbers.

FIG. 4 shows a comparison of the apo B-100 SH1-like region to SH1 kinase domains of
known signal transduction proteins, where basic amino acids are indicated in bold, "*" indicates
conserved amino acids, "-" indicates gaps introduced in the sequence in order to align the two
proteins, and identical amino acids between the two sequences are listed above the alignment.
The alignment is followed by a table identifying the reference proteins and the region of
apoB-100 used for the alignment along with their respective sequence identification numbers.


FIG. 6A and FIG. 6B show a homology alignment of specific regions of apo B-100 and the
activation regions located at the amino- and carboxyl-termini of signal transduction proteins,
where "*" indicates conserved amino acids, "-" indicates gaps introduced in the sequence in
order to align the two proteins, and identical amino acids between the two sequences are listed
above the alignment. Numbers in parentheses indicate amino acid residues shown in the
alignment and sequence identification numbers are listed in the right margin.

Discovery of these motifs in the apoB-100 sequences was based on a series of reports
(Ye et al., 1988; Trieu and McConathy, 1990; Trieu et al., 1991) which showed that free proline
inhibited binding of recombinant apo[a] to both Lp[a] and LDL. These results implied that
proline within the apoB-100 sequence interacted with the kringle binding pocket. Molecular
modeling was used to determine whether proline is a ligand for the different apo[a] kringle types
(Guevara, Jr. et al., 1993). These studies concluded that although free proline can be
accommodated by the ligand binding site of several apo[a] kringle types, proline located within
a polypeptide chain probably does not fit into any of the ligand binding sites of apo[a] kringles.
As an alternative possibility, proline might bind at an allosteric site on the kringle structure
(Guevara, Jr. et al., 1993), and thereby alter the ligand binding site of the kringle. A second
possibility is that apo[a] kringles are not involved at all, but rather that the proline/threonine-
rich inter-kringle regions (IKR's) associate with specific sites on apoB-100, and thereby enable
recombinant apo[a] to bind to Lp[a] and LDL.


a. The SH3 Domain 

The inter-kringle regions of apo[a] have homology to 3BP1 (FIG. 5). FIG. 5 shows the
inter-kringle proline-rich regions of apo[a] compared with the proline-rich region of
SH3-binding protein (3BP1), where the conserved prolines are indicated in bold and "-" indicates
gaps introduced in the sequences in order to align the two proteins. Following the alignments is
a table identifying the inter-kringle proline-rich regions of apo[a] and the proline-rich region of
3BP1 used for the alignment along with their respective sequence identification numbers.

Apo[a] is a hydrophilic, glycosylated apoprotein that is disulfide-linked to apo B-100 in
the Lipoprotein[a] particle. The proline-rich hinges between kringle structures of apo[a] are
suggestive of a role in signaling. Cicchetti et al. (1992) and Ren et al. (1993) described a ten
amino acid, proline-rich segment of the 3BP-1 protein which binds to an SH3 domain in Abl, a
non-receptor protein tyrosine kinase involved in signal transduction. The proline-rich IKR's in
apo[a] (McLean et al., 1987; Guevara, Jr. et al., 1992), like those in 3BP-1, contain the
sequence PXP (SEQ ID NO:2), which is important for the interaction of these motifs with their
corresponding SH3 domains.

Proline-rich binding proteins (BP's), SH3, and SH2 domains are regulatory domains in
signaling proteins which mediate enzymatic activity, participate in intracellular protein-protein
interactions, and bind to activated receptor protein-tyrosine kinases (Koch et al., 1991; Pawson,
1992; Schlessinger, 1994; Margolis, 1992; Waksman et al., 1993; Carpenter, 1992; Ugi et al.,
1994; Lowenstein et al., 1992; Guevara, Jr. et al., 1994; Pleiman et al., 1994). The sequence
similarities noted between apo[a] IKR's and the proline-rich segment of 3BP-1 suggest a similar
function for these regions of apo[a] in non-covalent interactions between apo[a] and
apoB-100, i.e., binding of a proline-rich region in apo[a] to an SH3-like region in apoB-100.


In apoB-100, at least 13 regions share high sequence similarities with SH3 domains.
SH3 domains are found in several signal transduction proteins such as phosphatidylinositol-3'-
kinase (PI3K) and the non-receptor tyrosine kinase Abl (see FIG. 1 and FIG. 4). This suggests
that apo B-100 may have signal transduction properties.


b. The SH2 Domain 

Many signal transduction proteins and other proteins such as tyrosine phosphatases and
tensin also contain SH2 domains (Koch et al., 1991; Pawson, 1992; Schlessinger, 1994;
Lowenstein et al., 1992), often flanked by SH3 domains. SH2 domains are typically comprised
of about 100 amino acids. In the signaling process, SH2 domains bind to specific
phosphotyrosine motifs of target proteins (Songyang et al., 1993; Escobedo et al., 1991). The
apoB-100 sequence was examined for the presence of SH2-like regions, and numerous regions in
the apoB-100 sequences were found to share some commonalities with SH2 domains of
signaling proteins (FIG. 3A-FIG. 3D). This suggests that apoB-100 may interact with
phosphorylated proteins through SH2-like regions.

c. The SH1 Domain

Typically, signal transduction proteins also contain a kinase domain or src homology
domain 1 (SH1), which is located in the carboxyl region of the protein and is comprised of about
300 amino acids (Rudd et al., 1993). SH1 domains are highly homologous. Regions of
apo B-100 have been found that share homology with SH1 domains (FIG. 4). In addition, apo B-100
shares homology with the catalytic loop or active site motif in these signaling proteins. For
example, the active site motif of lyn (EC 2.7.1.112) is R359KNYIHRDLRAAN (SEQ ID
NO:52), a sequence that is highly conserved. An analogous region is found in apoB-100,
K3919GTLAHRDFSAE (SEQ ID NO:53).


Furthermore, apo B-100 shares amino acid sequence homology with the activation
regions located at the amino- and carboxyl-termini of signal transduction proteins (FIG. 6A and
FIG. 6B). Protein Kinase C and c-AMP-dependent kinase control sites are present at the
amino-terminus of signal transduction proteins. Tyrosine kinase control sites are located in the
carboxyl-terminus of these proteins. Typically, there is little sequence homology at the
amino-termini, but high homology is common at the carboxyl-termini of signaling protein kinases.

Regions within apo B-100 having sequence similarity to SH3, SH2 and SH1 domains
and other cell signaling proteins all point to the possibility that apo B-100 is
involved in intracellular signaling.

5. PROTEIN EXPRESSION 

As described above, the inventors have discovered that a particular region of the
apoB-100 molecule is similar in sequence to the Sterol Regulatory Element Binding Proteins,
SREBP1 and 2 and ADD1. Other regions of the apoB-100 molecule are similar to specific
regions in other known DNA binding proteins including, but not limited to, ISGF3γ, coiled-coil
regions of GCN4 and hMLKL and the proline-pipe sequences of Tus. Further, the inventors
found that the amino acid sequences of apolipoproteins such as apoB-100 have regions involved
with nucleotide binding and nuclear localization. For example, apolipoproteins such as
apoB-100 show homology to the SH1 kinase domains of protein tyrosine kinases and the HIGH and
KMSK motif plus critical lysine of tRNA synthetases, both known to bind ATP, as well as to the
basic helix-loop-helix motif of sterol regulatory element binding proteins (SREBPs) known to
localize to the nucleus where they are involved in the regulation of transcription.

a. Expression of apoB100

In certain embodiments of the present invention, it will be necessary to obtain apoB100 or
lipoproteins containing apoB100 for use as DNA binding compositions. In particular
embodiments as described herein below, such apoB100 may be obtained from the lipoprotein
fraction of primate serum. As an alternative to purifying apoB100 from the LDL fraction of serum,
it is possible to generate pure fractions of apoB-100 by recombinant expression of the apoB100
gene. The apoB100 gene can be inserted into an appropriate expression system. The gene can be
expressed in any number of different recombinant DNA expression systems to generate large
amounts of the polypeptide product, which can then be purified and used as a DNA binding
composition as described herein.

In one embodiment, specific amino acid sequence domains of an apoB100 polypeptide
having, for example, the sequence of SEQ ID NO:1, can be prepared. These may, for instance, be
minor sequence variants of a polypeptide that arise due to natural variation within the population
or they may be homologues found in other species. They also may be sequences that do not occur
naturally but that are sufficiently similar that they function similarly and/or elicit an immune
response that cross-reacts with natural forms of the polypeptide.

The nucleotide binding, nuclear localization and signal transduction domains of the
apoB100 molecule are discussed in detail herein below. Recombinant technologies, well
known to those of skill in the art, may be used to produce recombinant apoB100 with one or
more of these domains having sequences that optimize the DNA binding and/or nuclear
localization capacities of the molecule. Furthermore, in certain instances it may be necessary to
"customize" such domains in order to increase binding to a particular DNA sequence whilst
decreasing the binding to other sequences. Alternatively, it may be preferable to alter a
particular apoB100 polypeptide in order to decrease its binding affinity for a particular
molecule. Accordingly, sequence variants of these domains can be prepared by standard methods
of site-directed mutagenesis such as those described below in the following section.

Amino acid sequence variants of an apoB100 polypeptide, or particular domains therein,
can be substitutional, insertional or deletion variants. Deletion variants lack one or more residues
of the native protein which are not essential for function or immunogenic activity.

Substitutional variants typically contain the exchange of one amino acid for another at one
or more sites within the protein, and may be designed to modulate one or more properties of the
polypeptide such as stability against proteolytic cleavage. Substitutions preferably are
conservative, that is, one amino acid is replaced with one of similar shape and charge.
Conservative substitutions are well known in the art and include, for example, the changes of:
alanine to serine; arginine to lysine; asparagine to glutamine or histidine; aspartate to glutamate;
cysteine to serine; glutamine to asparagine; glutamate to aspartate; glycine to proline; histidine to
asparagine or glutamine; isoleucine to leucine or valine; leucine to valine or isoleucine; lysine to
arginine; methionine to leucine or isoleucine; phenylalanine to tyrosine, leucine or methionine;
serine to threonine; threonine to serine; tryptophan to tyrosine; tyrosine to tryptophan or
phenylalanine; and valine to isoleucine or leucine.
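
By way of illustration only, the conservative substitutions listed above can be held in a lookup
table and used to flag which changes in a candidate variant are conservative. The following
Python sketch is not part of the original disclosure; the names CONSERVATIVE, is_conservative
and classify_variant are hypothetical, and the mapping is treated as directional, exactly as the
list is worded.

```python
# Conservative substitutions as listed in the text (one-letter codes).
# Directional: key = native residue, value = replacements named in the list.
CONSERVATIVE = {
    "A": {"S"},
    "R": {"K"},
    "N": {"Q", "H"},
    "D": {"E"},
    "C": {"S"},
    "Q": {"N"},
    "E": {"D"},
    "G": {"P"},
    "H": {"N", "Q"},
    "I": {"L", "V"},
    "L": {"V", "I"},
    "K": {"R"},
    "M": {"L", "I"},
    "F": {"Y", "L", "M"},
    "S": {"T"},
    "T": {"S"},
    "W": {"Y"},
    "Y": {"W", "F"},
    "V": {"I", "L"},
}


def is_conservative(native: str, replacement: str) -> bool:
    """True if the change native -> replacement appears in the list above."""
    return replacement in CONSERVATIVE.get(native, set())


def classify_variant(native_seq: str, variant_seq: str):
    """List (position, native, replacement, conservative?) for each difference.

    Positions are 1-based. Only substitutional variants of equal length
    are handled in this sketch.
    """
    if len(native_seq) != len(variant_seq):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, a, b, is_conservative(a, b))
        for i, (a, b) in enumerate(zip(native_seq, variant_seq))
        if a != b
    ]
```

For instance, classify_variant("MKV", "MRV") reports a single conservative change (lysine to
arginine) at position 2.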

Insertional variants include fusion proteins such as those used to allow rapid purification
of the polypeptide and also can include hybrid proteins containing sequences from other proteins
and polypeptides which are homologues of the polypeptide. For example, an insertional variant
could include portions of the amino acid sequence of the polypeptide from one species, together
with portions of the homologous polypeptide from another species. Other insertional variants can
include those in which additional amino acids are introduced within the coding sequence of the
polypeptide. These typically are smaller insertions than the fusion proteins described above and
are introduced, for example, into a protease cleavage site. Alternatively, insertional variants of the
present invention may be created in which one or more DNA binding domains and nuclear
localization domains have been added to a native apoB100 molecule to alter particular
characteristics of the molecule.

In one embodiment, major antigenic determinants of the polypeptide are identified by an
empirical approach in which portions of the gene encoding the polypeptide are expressed in a
recombinant host, and the resulting proteins tested for their ability to elicit an immune response.
For example, PCR can be used to prepare a range of cDNAs encoding peptides lacking
successively longer fragments of the C-terminus of the protein. The immunoprotective activity of
each of these peptides then identifies those fragments or domains of the polypeptide that are
essential for this activity. Further experiments in which only a small number of amino acids are
removed at each iteration then allow the antigenic determinants of the polypeptide to be
located.
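
The truncation series described above can be sketched as follows. This Python fragment is only
an illustration of how the set of progressively shorter coding sequences might be enumerated
before primers are ordered; the function name c_terminal_truncations and the step size are
hypothetical and are not taken from the disclosure.

```python
def c_terminal_truncations(cds: str, step_codons: int):
    """Return coding sequences lacking successively longer C-terminal pieces.

    cds         -- coding sequence (length a multiple of 3, no stop codon)
    step_codons -- number of codons removed at each iteration (assumed value)
    """
    if step_codons < 1:
        raise ValueError("step_codons must be at least 1")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    fragments = []
    keep = n_codons - step_codons
    while keep > 0:
        fragments.append(cds[: keep * 3])  # keep the N-terminal 'keep' codons
        keep -= step_codons
    return fragments
```

A coarse step locates the region of interest; repeating the procedure on that region with a
step of one or two codons corresponds to the finer mapping mentioned in the text.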

Another embodiment for the preparation of polypeptides according to the invention is the
use of peptide mimetics. Mimetics are peptide-containing molecules that mimic elements of
protein secondary structure. See, for example, Johnson et al., "Peptide Turn Mimetics" in
BIOTECHNOLOGY AND PHARMACY, Pezzuto et al., Eds., Chapman and Hall, New York
(1993). The underlying rationale behind the use of peptide mimetics is that the peptide backbone
of proteins exists chiefly to orient amino acid side chains in such a way as to facilitate molecular
interactions, such as those of antibody and antigen. A peptide mimetic is expected to permit
molecular interactions similar to the natural molecule.

Successful applications of the peptide mimetic concept have thus far focused on mimetics
of β-turns within proteins, which are known to be highly antigenic. Likely β-turn structure within
a polypeptide can be predicted by computer-based algorithms as discussed above. Once the
component amino acids of the turn are determined, peptide mimetics can be constructed to achieve
a similar spatial orientation of the essential elements of the amino acid side chains.

Modifications and changes may be made in the structure of a gene while still obtaining a
functional molecule that encodes a protein or polypeptide with desirable characteristics. The
following is a discussion based upon changing the amino acids of a protein to create an equivalent,
or even an improved, second-generation molecule. The amino acid changes may be achieved by
changing the codons of the DNA sequence, according to the codon data of Table 1.
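
As a hedged illustration of how the codon assignments of Table 1 might be used in practice, the
Python sketch below encodes the table as a dictionary, translates an RNA or DNA coding
sequence, and finds the codon(s) for a desired replacement amino acid that require the fewest
nucleotide changes. The names CODONS, translate and closest_codons are hypothetical; Table 1
lists only the 61 sense codons, so translation in this sketch simply stops at the first triplet
that is not in the table.

```python
# Codon assignments as given in Table 1 (RNA alphabet, one-letter amino acid codes).
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(seq: str) -> str:
    """Translate a coding sequence; T is read as U. Stops at any non-table triplet."""
    rna = seq.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TO_AA.get(rna[i:i + 3])
        if aa is None:  # stop codon or unrecognized triplet
            break
        protein.append(aa)
    return "".join(protein)


def closest_codons(codon: str, target_aa: str):
    """Codons for target_aa that differ from `codon` at the fewest positions."""
    codon = codon.upper().replace("T", "U")

    def distance(other: str) -> int:
        return sum(x != y for x, y in zip(codon, other))

    best = min(distance(c) for c in CODONS[target_aa])
    return sorted(c for c in CODONS[target_aa] if distance(c) == best)
```

For example, closest_codons("AAA", "R") indicates that a lysine codon AAA can be converted to
an arginine codon by a single nucleotide change (AGA).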


For example, certain amino acids may be substituted for other amino acids in a protein
structure without appreciable loss of interactive binding capacity with structures such as, for
example, antigen-binding regions of antibodies or binding sites on substrate molecules. Since it is
the interactive capacity and nature of a protein that defines that protein's biological functional
activity, certain amino acid substitutions can be made in a protein sequence, and its underlying
DNA coding sequence, and nevertheless obtain a protein with like properties. It is thus
contemplated by the inventors that various changes may be made in the DNA sequences of genes
without appreciable loss of their biological utility or activity.

In making such changes, the hydropathic index of amino acids may be considered. The
importance of the hydropathic amino acid index in conferring interactive biologic function on a
protein is generally understood in the art (Kyte & Doolittle, 1982).

TABLE 1

Amino Acids                      Codons

Alanine          Ala    A        GCA  GCC  GCG  GCU
Cysteine         Cys    C        UGC  UGU
Aspartic acid    Asp    D        GAC  GAU
Glutamic acid    Glu    E        GAA  GAG
Phenylalanine    Phe    F        UUC  UUU
Glycine          Gly    G        GGA  GGC  GGG  GGU
Histidine        His    H        CAC  CAU
Isoleucine       Ile    I        AUA  AUC  AUU
Lysine           Lys    K        AAA  AAG
Leucine          Leu    L        UUA  UUG  CUA  CUC  CUG  CUU
Methionine       Met    M        AUG
Asparagine       Asn    N        AAC  AAU
Proline          Pro    P        CCA  CCC  CCG  CCU
Glutamine        Gln    Q        CAA  CAG
Arginine         Arg    R        AGA  AGG  CGA  CGC  CGG  CGU
Serine           Ser    S        AGC  AGU  UCA  UCC  UCG  UCU
Threonine        Thr    T        ACA  ACC  ACG  ACU
Valine           Val    V        GUA  GUC  GUG  GUU
Tryptophan       Trp    W        UGG
Tyrosine         Tyr    Y        UAC  UAU


It is accepted that the relative hydropathic character of the amino acid contributes to the
secondary structure of the resultant protein, which in turn defines the interaction of the protein
with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens,
and the like.

Each amino acid has been assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics (Kyte & Doolittle, 1982). These are: isoleucine (+4.5);
valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9);
alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3);
proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5);
asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is known in the art that certain amino acids may be substituted by other amino acids
having a similar hydropathic index or score and still result in a protein with similar biological
activity, i.e., still obtain a biologically functionally equivalent protein. In making such changes,
the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those
which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly
preferred.
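
The ±2, ±1 and ±0.5 preference tiers can be checked mechanically. The Python sketch below uses
the hydropathic index values recited above; the names HYDROPATHY and hydropathy_tier are
hypothetical, and the function is offered only as an illustration of the comparison, not as a
predictor of functional equivalence.

```python
# Hydropathic index values as recited in the text (Kyte & Doolittle, 1982).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_tier(native: str, replacement: str):
    """Return the tightest tier (0.5, 1.0 or 2.0) containing the index difference.

    Returns None when the two indices differ by more than 2.
    The difference is rounded to one decimal to avoid floating-point edge cases.
    """
    diff = round(abs(HYDROPATHY[native] - HYDROPATHY[replacement]), 1)
    for tier in (0.5, 1.0, 2.0):
        if diff <= tier:
            return tier
    return None
```

Under these values, isoleucine to valine falls in the ±0.5 tier, lysine to arginine in the ±1
tier, and isoleucine to arginine outside all three tiers.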

It is also understood in the art that the substitution of like amino acids can be made
effectively on the basis of hydrophilicity. U.S. Patent 4,554,101, incorporated herein by
reference, states that the greatest local average hydrophilicity of a protein, as governed by the
hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein.

As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have been
assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate
(+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4);
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3);
valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5);
tryptophan (-3.4).

It is understood that an amino acid can be substituted for another having a similar
hydrophilicity value and still obtain a biologically equivalent and immunologically equivalent
protein. In such changes, the substitution of amino acids whose hydrophilicity values are
within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5
are even more particularly preferred.
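
The notion of "greatest local average hydrophilicity" can be illustrated with a sliding-window
average over the hydrophilicity values recited above. The Python sketch below is an assumption-
laden illustration: the names HYDROPHILICITY and max_local_hydrophilicity are hypothetical, the
nominal values are used without the ± ranges, and the default window of six residues is an
assumed parameter that is not specified in this document.

```python
# Hydrophilicity values as recited in the text (nominal values, ranges ignored).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def max_local_hydrophilicity(seq: str, window: int = 6):
    """Return (start_index, average) of the most hydrophilic window.

    start_index is 0-based; the first such window is returned on ties.
    The window length is an assumed, adjustable parameter.
    """
    if window < 1 or len(seq) < window:
        raise ValueError("window must be between 1 and len(seq)")
    averages = [
        sum(HYDROPHILICITY[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
    best = max(range(len(averages)), key=averages.__getitem__)
    return best, averages[best]
```

For a toy sequence such as "LLLLRKDELLLL" with a window of four residues, the charged stretch
RKDE beginning at index 4 gives the highest average.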

As outlined above, amino acid substitutions are generally based on the relative similarity
of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity,
charge, size, and the like. Exemplary substitutions that take various of the foregoing
characteristics into consideration are well known to those of skill in the art and include: arginine
and lysine; glutamate and aspartate; serine and threonine; glutamine and asparagine; and valine,
leucine and isoleucine.

b. apoB100 Variants

In order to determine the optimal DNA-binding sequences, recombinant fragments of
apoB-100 or other apolipoproteins may be used in mobility shift assays or other common
protein-DNA interaction assays, including, but not limited to, methylation interference assays,
DNase-I footprinting assays, UV-crosslinking assays, biotin/streptavidin affinity systems, or
screening expression libraries encoding DNA-binding proteins. The recombinant
apolipoprotein fragments are expressed by cloning these cDNA fragments in commercially
available eukaryotic expression vectors and employing recombinant DNA expression
techniques well known to the art.

In addition, the recombinant fragments may be mutated by employing site-directed 
mutagenesis or oligonucleotide-directed mutagenesis techniques in order to improve their 
affinity for nucleic acids and used either in their original or mutated form. Mutations in the 
recombinant apolipoprotein fragments may include, but are not limited to, addition of 
endosomolytic and/or nuclear localization peptide sequences employing common recombinant 
DNA technology. The recombinant protein fragments are prebound to the nucleic acids of
interest prior to their reassembly into freshly isolated lipoproteins and subsequent transfection. 
Alternatively, they are reassembled into lipoproteins prior to in vitro nucleic acid binding and 
subsequent transfection. Separation of protein-bound DNA from free DNA may be required 
prior to transfection and is accomplished by adsorption to nitrocellulose membranes or other 
common techniques including, but not limited to size-exclusion or density ultracentrifugation. 

Site-specific mutations can be made within the proposed DNA binding motifs or nuclear
localization signal sequences of the apolipoproteins described in this invention, in order to
improve their homology with known DNA binding motifs (e.g., SREBP-like DNA-binding
motifs, ISGF3γ-like DNA-binding motifs) and nuclear localization signal sequences (e.g., NLS
from human p53, AP-1, IGFBP-3, ir, and apo J). Specific mutations in the DNA sequences of
sterol regulatory elements (SRE) and IFN-stimulated response elements which affect the
DNA-binding affinity of SREBP and ISGF3γ, respectively, have been described (Smith et al.,
1990; Briggs et al., 1993; Wang et al., 1993; Veals et al., 1992).

Site-specific mutagenesis is a technique useful in the preparation of individual peptides,
or biologically functional equivalent proteins or peptides, through specific mutagenesis of the
underlying DNA. The technique further provides a ready ability to prepare and test sequence
variants, incorporating one or more of the foregoing considerations, by introducing one or more
nucleotide sequence change(s) into the DNA. Site-specific mutagenesis allows the production
of mutants through the use of specific oligonucleotide sequences which encode the DNA
sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to
provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on
both sides of the deletion junction being traversed. Typically, a primer of about 17 to 25
nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of
the sequence being altered.
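
The primer geometry described above (a mutated core flanked on each side by unaltered
template sequence) can be sketched in a few lines of Python. This is a minimal illustration
under stated assumptions: the function names design_mutagenic_primer and
in_preferred_length_range are hypothetical, the primer is written in the sense of the template
strand supplied, and no check of melting temperature or secondary structure is attempted.

```python
def design_mutagenic_primer(template: str, start: int, new_bases: str, flank: int = 10) -> str:
    """Build a primer carrying `new_bases` in place of template[start:start+len(new_bases)].

    template  -- sequence of the strand the primer should match outside the mutation
    start     -- 0-based index of the first base being altered
    new_bases -- replacement bases (same length as the stretch being replaced)
    flank     -- unaltered bases kept on each side (the text suggests about 5 to 10)
    """
    end = start + len(new_bases)
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence for the requested flanks")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + new_bases + right


def in_preferred_length_range(primer: str) -> bool:
    """True if the primer length falls in the 17 to 25 nucleotide range given in the text."""
    return 17 <= len(primer) <= 25
```

With a three-base codon change and ten flanking bases on each side, the resulting primer is
23 nucleotides long, which falls within the stated range.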

In general, the technique of site-specific mutagenesis is well known in the art. As will
be appreciated, the technique typically employs a bacteriophage vector that exists in both a
single-stranded and double-stranded form. Typical vectors useful in site-directed mutagenesis
include vectors such as the M13 phage. These phage vectors are commercially available and
their use is generally well known to those skilled in the art. Double-stranded plasmids are also
routinely employed in site-directed mutagenesis, which eliminates the step of transferring the
gene of interest from a phage to a plasmid.

In general, site-directed mutagenesis is performed by first obtaining a single-stranded
vector, or by melting apart the two strands of a double-stranded vector, which includes within its
sequence a DNA sequence encoding the desired protein. An oligonucleotide primer bearing the
desired mutated sequence is synthetically prepared. This primer is then annealed with the
single-stranded DNA preparation, and subjected to DNA polymerizing enzymes such as E. coli
polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing
strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated
sequence and the second strand bears the desired mutation. This heteroduplex vector is then
used to transform appropriate cells, such as E. coli cells, and clones are selected that include
recombinant vectors bearing the mutated sequence arrangement.

The preparation of sequence variants of the selected gene using site-directed
mutagenesis is provided as a means of producing potentially useful species and is not meant to
be limiting, as there are other ways in which sequence variants of genes may be obtained. For
example, recombinant vectors encoding the desired gene may be treated with mutagenic agents,
such as hydroxylamine, to obtain sequence variants.

6. PURIFICATION OF LIPOPROTEINS

The purification of plasma LDL involves obtaining a composition of Lp(a) and
subjecting the composition to reductive cleavage in a manner that allows the formation of
cleavage products apo (a) and apoB100. These products are then separated to yield purified
apo B100. Plasma lipoproteins may be isolated using standard sequential flotation
ultracentrifugation methods as described (Schumaker and Puppione, 1986).

a. Purification of Lp(a) 

Lp(a) is known to be made in the liver of primates. The LDL and VLDL in the plasma
represent the primary source for the purification of Lp(a). Plasma may be collected from any
primate source for the purposes of the invention, or indeed any other source suspected of
possessing Lp(a). The Lp(a) component of the plasma can then be separated from other
components of the plasma using ultracentrifugal flotation at a density of 1.21 g/mL for 20
hours at 50,000 rpm followed by affinity chromatography using lysine-Sepharose™. Of course,
the ultracentrifugation conditions are only exemplary and those of skill in the art will be able to
vary them according to the particular equipment and study need without undue experimentation.
The plasma may be supplemented with various inhibitors to prevent the Lp(a) from interacting
with LDL components of the plasma.
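
Because a rotor speed in rpm corresponds to different centrifugal forces in different rotors,
it can be useful to convert the exemplary 50,000 rpm into a relative centrifugal force when
adapting the procedure to other equipment. The Python sketch below applies the standard
physical relationship between angular velocity, radius and acceleration; the function name
rcf_from_rpm is hypothetical, and the 8 cm radius used in the example is an assumed value, as
no rotor is specified in this document.

```python
import math

STANDARD_GRAVITY_M_S2 = 9.80665  # standard acceleration of gravity


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given radius and speed.

    RCF = omega^2 * r / g, with omega in rad/s and r in metres.
    """
    omega = 2.0 * math.pi * rpm / 60.0      # angular velocity, rad/s
    radius_m = radius_cm / 100.0            # convert cm to m
    return omega ** 2 * radius_m / STANDARD_GRAVITY_M_S2


# Example with an assumed (hypothetical) rotor radius of 8 cm:
example_rcf = rcf_from_rpm(50_000, 8.0)
```

The same function can be inverted numerically, or by rearranging the formula, to find the
speed that reproduces a given force in a rotor of different radius.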

Having separated Lp(a) from the other plasma components, the Lp(a) sample is purified
using lysine-Sepharose™ affinity chromatography. This separation is
described in detail in PCT publication WO 97/17371, specifically incorporated herein by
reference.

In some cases, it is desirable to use a method other than lysine-Sepharose™
chromatography for the purification of Lp(a). In such instances other chromatographic methods
such as FPLC may be employed. Such methods are disclosed in Scanu et al., 1993, incorporated
herein by reference, and may be used in conjunction with the present invention to purify
apo B100 from Lp(a).

The product purity can be assessed by, for example, mobility on 1% agarose gels and
Western blots of SDS-PAGE gels utilizing anti-LDL antibodies.

b. Isolation of Apo B100 from Lp(a)
(i) Using Centrifugation

Following the purification of Lp(a), the apoB100 may be separated from the apo (a)
fraction of the Lp(a) complex using reductive cleavage. The purified intact Lp(a) protein is
subjected to reductive cleavage wherein a known volume of Lp(a) is incubated with a reductant.
Exemplary reductants include homocysteine, N-acetyl cysteine, 2-mercaptoethanol,
3-mercaptopropionate, 2-aminoethanol, dithiothreitol, and DTE.


The reaction is incubated at room temperature for 10-20 minutes. This is followed by the
addition of an inhibitor to prevent non-covalent, lysine-mediated interactions between apo (a) and
apoB100. ε-Aminocaproic acid (EACA) may be used as such an inhibitor, or may be substituted
by other lysine analogues, for example, compounds such as
trans-4-(aminomethyl)-cyclohexanecarboxylic acid, N-acetyl-L-lysine, p-benzylamine sulfonic
acid, hexylamine, benzamidine, benzylamine, and L-proline. Of course these are only exemplary
lysine analogues and those of skill in the art may
use other lysine analogues to prevent interaction between apo (a) and apoB100 proteins. The
reaction conditions are described in greater detail in PCT publication number WO 97/17371.
Of course, the conditions for the separation of apo (a) from the reaction mixture using sucrose
density ultracentrifugation are only exemplary, and other methods commonly used by those of
skill in the art may be used.

(ii) Isolation Using Chromatographic Methods

As an alternative to the above methods for the isolation of apo B100 from Lp(a),
chromatographic methods may be utilized as exemplified below.


Heparin Sepharose™ Chromatography 

Lp(a) may be treated with a reducing agent in the presence of a lysine analogue. For the
purposes of this invention the lysine analogue is supplied to prevent the interaction of apo (a) with
apoB100. The reducing agent is supplied to break the disulfide bond of Lp(a). Lysine analogues
for this invention include but are not limited to compounds such as EACA,
trans-4-(aminomethyl)-cyclohexanecarboxylic acid, N-acetyl-L-lysine, p-benzylamine sulfonic
acid, hexylamine, benzamidine, benzylamine, and L-proline; any other lysine analogue known to
the skilled artisan may also be used. Examples of reducing agents that may be used in this
invention include, but are not limited to, homocysteine, N-acetyl cysteine, 2-mercaptoethanol,
3-mercaptopropionate, 2-aminoethanol, dithiothreitol, and DTE.

For example, the mixture of Lp(a), a reducing agent and a lysine analogue is incubated for a
suitable period of time in a suitable buffer of pH 7.4. A heparin-Sepharose™ column is
equilibrated with a suitable buffer containing the lysine analogue and the reducing agent. The
mixture is applied to the equilibrated column, the column is washed with the same buffer and the
first eluate is collected.

The first eluate from the column contains the apo (a) dissociated from Lp(a). The "free"
apo (a) is dialyzed against an appropriate buffer; the dialysis product is pure apo (a) that may be
freeze-dried and stored at -20°C or used immediately. The column is further washed with the
buffer for a total of three column volumes followed by 3 volumes of 2 M NaCl in the buffer. The
high salt concentration serves to dissociate the remaining unreacted Lp(a) and LDL containing
apoB100 free of apo (a).

Lysine-Sepharose™ Chromatography

An alternative to heparin-Sepharose™ chromatography is lysine chromatography. In this
type of separation, Lp(a) is treated with a suitable reducing agent and then applied to a
lysine-Sepharose™ column that has been equilibrated with a suitable buffer of pH 7.4 containing
the reducing agent. The column is washed with the same buffer and the first volume of eluate is
collected. This fraction contains LDL dissociated from apo (a). Further details of this type of
chromatography for separating apolipoproteins may be found in PCT Publication WO 97/17371.

7. SCREENING NUCLEIC ACIDS THAT BIND LDL 

Specifically contemplated by the present inventors are chip-based DNA technologies
such as those described by Hacia et al. (1996) and Shoemaker et al. (1996). Chip technologies
may be used to present DNA arrays for screening.

In a first embodiment, chip technologies may be employed to synthesize a variety of
DNAs in order to test for their binding to an LDL with a specific apoB100 binding region.
Briefly, these techniques involve quantitative methods for analyzing large numbers of nucleic
acids rapidly and accurately. By tagging genes with oligonucleotides or using fixed probe
arrays, one can employ chip technology to segregate target molecules as high density arrays and
screen these molecules on the basis of hybridization. See also Pease et al. (1994); Fodor et al.
(1991).

Thus, the invention may be applied for the screening of nucleic acids that bind to
apoB100-containing lipoproteins. The LDL polypeptide or fragment may be free in
solution, fixed to a support, or expressed in or on the surface of a cell, for example a bacterial
cell. Either the LDL polypeptide or the nucleic acid may be labeled, thereby permitting
determination of binding to the DNA molecules.

In another embodiment, the assay may measure the inhibition of binding of LDL to a
natural or artificial substrate or binding partner. Competitive binding assays can be performed
in which one of the agents (LDL, binding partner or compound) is labeled. Usually, the
polypeptide will be the labeled species. One may measure the amount of free label versus
bound label to determine binding or inhibition of binding.
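
The comparison of free and bound label reduces to simple arithmetic, sketched below in Python
for illustration. The function names fraction_bound and percent_inhibition are hypothetical,
the label signals may be in any consistent units (for example counts per minute), and the
sketch assumes background has already been subtracted.

```python
def fraction_bound(bound_signal: float, free_signal: float) -> float:
    """Fraction of total label recovered in the bound pool."""
    total = bound_signal + free_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return bound_signal / total


def percent_inhibition(fraction_without: float, fraction_with: float) -> float:
    """Percent reduction in bound fraction caused by a competitor.

    fraction_without -- bound fraction in the absence of competitor
    fraction_with    -- bound fraction in the presence of competitor
    """
    if fraction_without <= 0:
        raise ValueError("reference binding must be positive")
    return (1.0 - fraction_with / fraction_without) * 100.0
```

For instance, a drop in bound fraction from 0.30 to 0.15 in the presence of an unlabeled
competitor corresponds to 50 percent inhibition of binding.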

Another technique for high throughput screening of compounds is described in
WO 84/03564. Large numbers of small test nucleic acids (test compounds) are synthesized on a
solid substrate, such as plastic pins or some other surface. Similarly, test compounds of the
present invention are reacted with LDL and washed. Bound polypeptide is detected by various
methods.

In an alternative embodiment, the invention may be applied for the screening for
variants of apoB100-containing lipoproteins to determine a greater or lesser affinity for a
particular type of nucleic acid. These screening methods would be similar to those described
above, except that the LDL peptide variants will be presented as an array with the nucleic acid
binding regions being used to probe the array. Currently, one of the most widely used
approaches for screening polypeptide libraries is to display polypeptides on the surface of
filamentous bacteriophage (Smith, 1991; Smith, 1992). Ladner et al. (U.S. Patent No.
5,403,484, specifically incorporated herein by reference) reported the display of proteins on the
outer surface of a chosen bacterial cell, spore or phage, in order to identify and characterize
binding proteins.


In an alternative embodiment, purified apoB100 or DNA-binding fragments thereof can
be coated directly onto plates for use in the screening techniques. Alternatively, antibodies to
the polypeptide can be used to immobilize the polypeptide to a solid phase. Also, fusion
proteins containing a DNA binding region (preferably a terminal region) may be used to link
peptides to a solid phase. Once linked, randomly sheared genomic DNA, transcripts or
randomly generated oligomers may be contacted with the bound peptides. Any bound nucleic
acid fragments can be identified by PCR using random primers if they are large enough. In the
case where random oligomers are used, the oligomers, in addition to the random region, may
comprise built-in primer binding sites that can be used to amplify an intervening random region,
thereby identifying the region binding to apoB100.

Thus, using the technologies described herein, it will be possible for one of skill in the
art to screen for and isolate a variety of nucleic acids that bind to apoB100 and variants of
apoB100 that exhibit nucleic acid binding capacity, including increased or decreased binding as
compared to wild-type apoB100.
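
The random-oligomer design mentioned above, in which a random region is flanked by fixed
primer binding sites, can be illustrated with the following Python sketch. It is offered only
as a hedged example: the function names and the two flanking site sequences in the usage note
are hypothetical, and both sites are written as they appear on the oligomer strand (the
reverse primer itself would be the reverse complement of the 3' site).

```python
import random


def make_random_oligo_library(n, random_len, fwd_site, rev_site, seed=0):
    """Generate n oligomers of the form fwd_site + random region + rev_site.

    A seeded generator is used so that the illustrative library is reproducible.
    """
    rng = random.Random(seed)
    library = []
    for _ in range(n):
        core = "".join(rng.choice("ACGT") for _ in range(random_len))
        library.append(fwd_site + core + rev_site)
    return library


def extract_random_region(oligo, fwd_site, rev_site):
    """Recover the intervening random region from a sequenced oligomer."""
    if not (oligo.startswith(fwd_site) and oligo.endswith(rev_site)):
        raise ValueError("oligomer does not carry the expected flanking sites")
    return oligo[len(fwd_site):len(oligo) - len(rev_site)]
```

In use, one might call make_random_oligo_library(1000, 20, "GGATCCGACTAGTC",
"CTGCAGTTCGAAGC") with arbitrary placeholder sites, contact the library with immobilized
apoB100, amplify whatever is retained, and apply extract_random_region to the sequenced
products.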


8. LDL-DNA COMPLEX FORMATION 

In particular aspects of the present invention, lipoproteins are employed in order to
transport DNA into cells in vitro and in vivo. In the present invention, optimal DNA/LDL
binding has been established. In particular embodiments, DNA and LDL protein at a molar
ratio of 1:1 are incubated at 37°C for 30 min in a buffered solution. An exemplary buffer may
be 50 mM Tris-HCl at pH 7.4 containing 150 mM NaCl and 10 mM MgCl2. The
concentrations of DNA and LDL protein may range from the pmolar range to the µmolar range.
In a preferred embodiment, 0.39 pmole DNA is incubated with 0.39 pmole LDL protein.
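
To set up an equimolar incubation, the molar amounts above must be converted to masses. The
Python sketch below shows one way to do this and is an illustration only. The function names
are hypothetical; the average mass of about 650 g/mol per base pair of double-stranded DNA is
a common approximation rather than a value from this document; and the plasmid length
(5,000 bp) and protein mass (512 kDa) in the example are assumed illustrative inputs that
should be replaced with values for the actual reagents.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumption: approximate average mass of one dsDNA base pair


def dsdna_mass_ng(pmol: float, length_bp: int) -> float:
    """Mass in ng of `pmol` picomoles of double-stranded DNA of the given length."""
    # pmol * (g/mol) gives picograms; divide by 1000 for nanograms.
    return pmol * length_bp * AVG_BP_MASS_G_PER_MOL / 1000.0


def protein_mass_ng(pmol: float, mass_kda: float) -> float:
    """Mass in ng of `pmol` picomoles of a protein of the given molecular mass."""
    # pmol * kDa = 1e-12 mol * 1e3 g/mol = 1e-9 g = ng.
    return pmol * mass_kda


# Example with assumed inputs: 0.39 pmol of a hypothetical 5,000 bp plasmid
# and 0.39 pmol of a protein with an assumed mass of 512 kDa.
dna_ng = dsdna_mass_ng(0.39, 5000)
protein_ng = protein_mass_ng(0.39, 512.0)
```

Under these assumed inputs the 1:1 molar mixture corresponds to roughly 1.27 µg of plasmid
and roughly 0.2 µg of protein.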

The incubation conditions may be altered to increase or decrease the efficiency of
DNA/LDL binding. For example, the incubation may occur at temperatures ranging from 4°C
to 50°C; thus it is contemplated that the reaction mixture may be incubated at 4°C, 6°C, 8°C,
10°C, 12°C, 14°C, 16°C, 18°C, 20°C, 22°C, 24°C, 26°C, 28°C, 30°C, 32°C, 34°C, 36°C, 38°C,
40°C, 42°C, 44°C, 46°C, 48°C, or 50°C.


The time of incubation may be varied from as little as 10 minutes to as long as 5 hours.
Thus it is well within the skill of one in the art to incubate the mixture for varying lengths of
time.

Other embodiments contemplate varying the concentration of MgCl2 in the medium.
Thus the MgCl2 concentration may vary from 1 mM to 100 mM. Thus, it is contemplated that
the reaction mixture contains 5 mM MgCl2, 10 mM MgCl2, 12 mM MgCl2, 15 mM MgCl2,
20 mM MgCl2, 30 mM MgCl2, 35 mM MgCl2, 40 mM MgCl2, 50 mM MgCl2, 60 mM MgCl2,
65 mM MgCl2, 70 mM MgCl2, 80 mM MgCl2, 90 mM MgCl2, or 100 mM MgCl2.
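
When the incubation temperature, time and MgCl2 concentration are varied together, the
combinations can be enumerated systematically. The Python sketch below builds such a grid
from the temperature and MgCl2 values recited above; the names TEMPERATURES_C, MGCL2_MM,
TIMES_MIN and condition_grid are hypothetical, and the specific time points are assumed
values chosen within the stated range of 10 minutes to 5 hours.

```python
from itertools import product

# Temperatures recited in the text: 4 to 50 degrees C in 2-degree steps.
TEMPERATURES_C = list(range(4, 51, 2))

# MgCl2 concentrations recited in the text (mM).
MGCL2_MM = [5, 10, 12, 15, 20, 30, 35, 40, 50, 60, 65, 70, 80, 90, 100]

# Assumed time points (minutes) within the stated 10 min to 5 h range.
TIMES_MIN = [10, 30, 60, 120, 300]


def condition_grid():
    """Return every combination of temperature, time and MgCl2 concentration."""
    return [
        {"temp_c": temp, "time_min": minutes, "mgcl2_mm": conc}
        for temp, minutes, conc in product(TEMPERATURES_C, TIMES_MIN, MGCL2_MM)
    ]
```

A full factorial over these values is large, so in practice a subset of the grid would
normally be selected around the exemplary condition of 37°C, 30 minutes and 10 mM MgCl2.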

9. GENE DELIVERY AND EXPRESSION IN EUKARYOTIC CELLS

The gene delivery system of the instant invention can be used to express any gene of
interest in eukaryotic cells. The gene or its cDNA sequence is cloned into a plasmid containing
the specific lipoprotein binding sequences (including, but not limited to SRE, E/C, FAS) and/or
any eukaryotic regulatory sequence (for example, but not limited to HCMV, or tyrosine kinase
promoter region) using DNA cloning techniques well known to the art. The orientation,
number and location of the lipoprotein binding sequences may vary within the nucleic acid
vector, but should not interrupt the protein coding sequence of the gene of interest.

The gene delivery system of the instant invention (see FIG. 15) can be used to transfect
eukaryotic cells either in vivo or in vitro with any expression vector containing one or more of
the aforementioned lipoprotein binding sequences. Expression vectors are designed using
recombinant DNA cloning techniques known to the art and generally include five components
linked in the following 5' to 3' orientation: 1) a eukaryotic promoter sequence, 2) a sequence
encoding a 5' untranslated RNA (UTR) which may include a first intron sequence followed by a
consensus Kozak sequence and an initiation ATG, 3) a protein coding sequence, 4) a 3' UTR,
and 5) a cognate transcription terminator sequence.
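
The five-component layout just described can be represented as a small data structure that
assembles the parts in 5' to 3' order. The Python sketch below is a simplified illustration
only: the class name ExpressionCassette and its field names are hypothetical, the sequences
in the usage note are short placeholders rather than real regulatory elements, and, as a
simplification, the initiation ATG is carried as the first codon of the coding sequence
rather than as part of the 5' UTR element.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCassette:
    promoter: str          # 1) eukaryotic promoter
    five_prime_utr: str    # 2) 5' UTR (may include an intron and Kozak context)
    coding_sequence: str   # 3) protein coding sequence, beginning with ATG
    three_prime_utr: str   # 4) 3' UTR
    terminator: str        # 5) transcription terminator

    def cds_start(self) -> int:
        """0-based position of the initiation ATG in the assembled cassette."""
        return len(self.promoter) + len(self.five_prime_utr)

    def assemble(self) -> str:
        """Join the five components in 5' to 3' order after basic checks."""
        cds = self.coding_sequence.upper()
        if not cds.startswith("ATG"):
            raise ValueError("coding sequence must begin with an initiation ATG")
        if len(cds) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        return "".join([
            self.promoter,
            self.five_prime_utr,
            self.coding_sequence,
            self.three_prime_utr,
            self.terminator,
        ])
```

For example, ExpressionCassette("TTGACA", "GCCACC", "ATGAAATAA", "GGG", "AATAAA").assemble()
returns the placeholder elements joined in order, with the ATG at position 12.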

Lipoproteins are isolated from blood in a manner similar to the previously described
procedures (see, Example 1) and bound to the nucleic acids of interest in a manner similar to the
previously described DNA binding protocol (see, Example 2). Separation of protein-bound
DNA from free DNA may be required prior to transfection and can be accomplished by
adsorption to nitrocellulose membranes or other techniques well known to the art including, but
not limited to size-exclusion or density ultracentrifugation.


a) Control Regions 

In order for the gene delivery system of the present invention to effect expression of a
transcript encoding a selected gene, the polynucleotides encoding these genes will be under the
transcriptional control of a promoter. A "promoter" refers to a DNA sequence recognized by
the synthetic machinery of the host cell, or introduced synthetic machinery, that is required to
initiate the specific transcription of a gene. The phrase "under transcriptional control" means
that the promoter is in the correct location in relation to the polynucleotide to control RNA
polymerase initiation and expression of the polynucleotide.

The term promoter will be used here to refer to a group of transcriptional control
modules that are clustered around the initiation site for RNA polymerase II. Much of the
thinking about how promoters are organized derives from analyses of several viral promoters,
including those for the HSV thymidine kinase (tk) and SV40 early transcription units. These
studies, augmented by more recent work, have shown that promoters are composed of discrete
functional modules, each consisting of approximately 7-20 bp of DNA, and containing one or
more recognition sites for transcriptional activator or repressor proteins.

At least one module in each promoter functions to position the start site for RNA 
synthesis. The best known example of this is the TATA box, but in some promoters lacking a 
TATA box, such as the promoter for the mammalian terminal deoxynucleotidyl transferase gene 
and the promoter for the SV40 late genes, a discrete element overlying the start site itself helps
to fix the place of initiation. 

Additional promoter elements regulate the frequency of transcriptional initiation.
Typically, these are located in the region 30-110 bp upstream of the start site, although a
number of promoters have recently been shown to contain functional elements downstream of
the start site as well. The spacing between promoter elements frequently is flexible, so that
promoter function is preserved when elements are inverted or moved relative to one another. In
the tk promoter, the spacing between promoter elements can be increased to 50 bp apart before
activity begins to decline. Depending on the promoter, it appears that individual elements can
function either cooperatively or independently to activate transcription.

The particular promoter that is employed to control the expression of a therapeutic gene 
is not believed to be critical, so long as it is capable of expressing the polynucleotide in the 
targeted cell. Thus, where a human cell is targeted, it is preferable to position the 
polynucleotide coding region adjacent to and under the control of a promoter that is capable of
being expressed in a human cell. Generally speaking, such a promoter might include either a
human or viral promoter.

In preferred embodiments, the human cytomegalovirus (CMV) immediate early gene 
promoter, the SV40 early promoter and the Rous sarcoma virus long terminal repeat can be
used to obtain high-level expression of the polynucleotide of interest. The use of other viral or 
mammalian cellular or bacterial phage promoters which are well-known in the art to achieve 
expression of polynucleotides is contemplated as well, provided that the levels of expression are 
sufficient to produce a growth inhibitory effect. 

By employing a promoter with well-known properties, the level and pattern of 
expression of a polynucleotide following transfection can be optimized. For example, selection 
of a promoter which is active in specific cells, such as tyrosinase (melanoma), alpha-fetoprotein 
and albumin (liver tumors), CC10 (lung tumor) and prostate-specific antigen (prostate tumor)
will permit tissue-specific expression of the therapeutic gene.

Enhancers were originally detected as genetic elements that increased transcription from 
a promoter located at a distant position on the same molecule of DNA. This ability to act over a 
large distance had little precedent in classic studies of prokaryotic transcriptional regulation. 
Subsequent work showed that regions of DNA with enhancer activity are organized much like
promoters. That is, they are composed of many individual elements, each of which binds to one 
or more transcriptional proteins. 

The basic distinction between enhancers and promoters is operational. An enhancer 
region as a whole must be able to stimulate transcription at a distance; this need not be true of a
promoter region or its component elements. On the other hand, a promoter must have one or 
more elements that direct initiation of RNA synthesis at a particular site and in a particular 
orientation, whereas enhancers lack these specificities. Promoters and enhancers are often 
overlapping and contiguous, often seeming to have a very similar modular organization. 

Additionally, any promoter/enhancer combination (as per the Eukaryotic Promoter Data
Base, EPDB) could be used to drive expression of a particular construct. Use of a T3, T7 or SP6
cytoplasmic expression system is another possible embodiment. Eukaryotic cells can support
cytoplasmic transcription from certain bacteriophage promoters if the appropriate bacteriophage
polymerase is provided, either as part of the delivery complex or as an additional genetic
expression vector.

According to the present invention, a number of different promoters are required. It is 
contemplated that these promoters may be the same or different, but the selection of particular 
promoters for particular uses may be advantageous.

b) IRES 

In certain embodiments of the invention, the use of internal ribosome binding site
(IRES) elements may prove advantageous. These
elements are used to create multigene, or polycistronic, messages. IRES elements are able to
bypass the ribosome scanning model of 5' methylated cap-dependent translation and begin
translation at internal sites (Pelletier and Sonenberg, 1988). IRES elements from two members
of the picornavirus family (polio and encephalomyocarditis) have been described (Pelletier and
Sonenberg, 1988), as well as an IRES from a mammalian message (Macejak and Sarnow, 1991).
IRES elements can be linked to heterologous open reading frames. Multiple open reading
frames can be transcribed together, each separated by an IRES, creating polycistronic messages.
By virtue of the IRES element, each open reading frame is accessible to ribosomes for efficient
translation. Multiple genes can be efficiently expressed using a single promoter/enhancer to
transcribe a single message.

Any heterologous open reading frame can be linked to IRES elements. This includes
genes for secreted proteins, multi-subunit proteins encoded by independent genes, intracellular
or membrane-bound proteins and selectable markers. In this way, expression of several proteins
can be simultaneously engineered into a cell with a single construct and a single selectable
marker.

In addition, it may be desirable to include polyadenylation signals in the vectors. These
signals serve to terminate transcription and to stabilize mRNA transcripts produced from the
vectors. A preferred polyadenylation signal is an SV40 polyadenylation signal.

c) Genes

The present invention contemplates the use of a variety of different genes inserted into 
the SV40 vector. For example, genes encoding enzymes, hormones, cytokines, oncogenes, 
receptors, tumor suppressors, transcription factors, drug selectable markers, toxins and various 
antigens are contemplated as suitable genes for use according to the present invention. In 
addition, antisense constructs derived from oncogenes are other "genes" of interest according to
the present invention. 

A gene commonly used in gene therapy trials is p53, which
currently is recognized as a tumor suppressor gene. High levels of mutant p53 have been found
in many cells transformed by chemical carcinogenesis, ultraviolet radiation, and several viruses.
The p53 gene is a frequent target of mutational inactivation in a wide variety of human tumors
and is already documented to be the most frequently mutated gene in common human cancers.
It is mutated in over 50% of human NSCLC (Hollstein et al., 1991) and in a wide spectrum of
other tumors. Overexpression of wild-type p53 has been shown in some cases to be
anti-proliferative in human tumor cell lines. Thus, p53 can act as a negative regulator of cell growth
(Weinberg, 1991) and may directly suppress uncontrolled cell growth or indirectly activate
genes that suppress this growth. It has also been reported that transfection of DNA encoding
wild-type p53 into cancer cell lines restores growth suppression control in such cells (Casey et
al., 1991; Takahasi et al., 1992). It is thus proposed that the treatment of p53-associated
cancers with wild-type p53 in the compositions of the present invention will reduce the number
of malignant cells or their growth rate.

p16INK4 belongs to a newly described class of CDK-inhibitory proteins that also includes
p16B and p27KIP1. The p16INK4 gene maps to 9p21, a chromosome region frequently
deleted in many tumor types. Homozygous deletions and mutations of the p16INK4 gene are
frequent in human tumor cell lines. This evidence suggests that the p16INK4 gene is a tumor
suppressor gene. This interpretation has been challenged, however, by the observation that the
frequency of p16INK4 gene alterations is much lower in primary uncultured tumors than in
cultured cell lines (Caldas et al., 1994; Cheng et al., 1994; Hussussian et al., 1994; Kamb et al.,
1994; Kamb et al., 1994; Mori et al., 1994; Okamoto et al., 1994; Nobori et al., 1995; Orlow et
al., 1994; Arap et al., 1995). Restoration of wild-type p16INK4 function by transfection with a
plasmid expression vector reduced colony formation by some human cancer cell lines
(Okamoto, 1994; Arap, 1995).

Cell adhesion molecules, or CAMs, are known to be involved in a complex network of
molecular interactions that regulate organ development and cell differentiation (Edelman,
1985). Recent data indicate that aberrant expression of CAMs may be involved in the
tumorigenesis of several neoplasms; for example, decreased expression of E-cadherin, which is
predominantly expressed in epithelial cells, is associated with the progression of several kinds
of neoplasms (Edelman and Crossin, 1991; Frixen et al., 1991; Bussemakers et al., 1992;
Matsura et al., 1992; Umbas et al., 1992). Also, Giancotti and Ruoslahti (1990) demonstrated
that increasing expression of α5β1 integrin by gene transfer can reduce tumorigenicity of
Chinese hamster ovary cells in vivo. C-CAM now has been shown to suppress tumor growth
in vitro and in vivo. Thus, the compositions of the present invention can be employed to
mediate C-CAM suppression of tumor cell growth.

Other tumor suppressors that may be employed according to the present invention 
include RB, APC, DCC, NF-1, NF-2, WT-1, MEN-I, MEN-II, zac1, p73, VHL, MMAC1, FCC
and MCC. Inducers of apoptosis, such as Bax, Bak, Bcl-Xs, Bik, Bid, Harakiri, Ad E1B, Bad
and ICE-CED3 proteases, similarly could find use according to the present invention.

Various enzyme genes are of interest according to the present invention. Such enzymes 
include cytosine deaminase, hypoxanthine-guanine phosphoribosyltransferase,
galactose-1-phosphate uridyltransferase, phenylalanine hydroxylase, glucocerebrosidase, sphingomyelinase,
α-L-iduronidase, glucose-6-phosphate dehydrogenase, HSV thymidine kinase and human
thymidine kinase.

In another example, the expression vector may include a nucleotide sequence encoding
functional apolipoprotein A-I for the prevention or treatment of atherosclerosis.
Atherosclerosis is a disease that is characterized by the development of atherosclerotic lesions
which contain cholesterol esters and other lipids that are derived from the blood circulation.
The plasma concentration of HDL is inversely correlated with the risk for development of
atherosclerosis. HDL particles present in the blood circulation take up free cholesterol from
extrahepatic cells; through the action of LCAT (lecithin-cholesterol acyltransferase), this
cholesterol is converted to cholesterol esters and stored in the core of the HDL particles. The
HDL cholesterol esters are transported to the liver, either directly or indirectly via transfer to
triglyceride-rich lipoproteins (i.e., VLDL, IDL, LDL), by a process called "reverse cholesterol
transport". Reverse cholesterol
transport is of great importance for maintaining cholesterol homeostasis since the liver is the
major organ for cholesterol excretion from the body via bile acids. Apo A-I is the major protein
constituent of HDL and a cofactor of LCAT. Therefore, increasing the plasma concentration of
apo A-I containing HDL can increase reverse cholesterol transport and reduce the risk for
atherosclerosis.

Hormones are another group of genes that may be used in the SV40 vectors described
herein. Included are growth hormone, prolactin, placental lactogen, luteinizing hormone,
follicle-stimulating hormone, chorionic gonadotropin, thyroid-stimulating hormone, leptin,
adrenocorticotropin (ACTH), angiotensin I and II, β-endorphin, β-melanocyte stimulating
hormone (β-MSH), cholecystokinin, endothelin I, galanin, gastric inhibitory peptide (GIP),
glucagon, insulin, lipotropins, neurophysins, somatostatin, calcitonin, calcitonin gene related
peptide (CGRP), β-calcitonin gene related peptide, hypercalcemia of malignancy factor (1-40),
parathyroid hormone-related protein (107-139) (PTH-rP), parathyroid hormone-related protein
(107-111) (PTH-rP), glucagon-like peptide (GLP-1), pancreastatin, pancreatic peptide, peptide
YY, PHM, secretin, vasoactive intestinal peptide (VIP), oxytocin, vasopressin (AVP),
vasotocin, enkephalinamide, metorphinamide, alpha melanocyte stimulating hormone
(alpha-MSH), atrial natriuretic factor (5-28) (ANF), amylin, amyloid P component (SAP-1),
corticotropin releasing hormone (CRH), growth hormone releasing factor (GHRH), luteinizing
hormone-releasing hormone (LHRH), neuropeptide Y, substance K (neurokinin A), substance
P and thyrotropin releasing hormone (TRH).

Other classes of genes that are contemplated to be inserted into the SV40 vectors of the
present invention include interleukins and cytokines, such as interleukin 1 (IL-1), IL-2, IL-3,
IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, GM-CSF and G-CSF.

Other therapeutic genes might include genes encoding antigens such as viral antigens,
bacterial antigens, fungal antigens or parasitic antigens. Viruses include picornavirus,
coronavirus, togavirus, flavivirus, rhabdovirus, paramyxovirus, orthomyxovirus, bunyavirus,
arenavirus, reovirus, retrovirus, papovavirus, parvovirus, herpesvirus, poxvirus, hepadnavirus,
and spongiform virus. Preferred viral targets include influenza, herpes simplex virus 1 and 2,
measles, smallpox, polio or HIV. Pathogens include trypanosomes, tapeworms, roundworms
and helminths. Also, tumor markers, such as fetal antigen or prostate specific antigen, may be
targeted in this manner. Preferred examples include HIV env proteins and hepatitis B surface
antigen. Administration of a vector according to the present invention for vaccination purposes
would require that the vector-associated antigens be sufficiently non-immunogenic to enable
long term expression of the transgene, for which a strong immune response would be desired.
Preferably, vaccination of an individual would only be required infrequently, such as yearly or
biennially, and provide long term immunologic protection against the infectious agent.

In yet another embodiment, the heterologous gene may include a single-chain antibody.
Methods for the production of single-chain antibodies are well known to those of skill in the art.
The skilled artisan is referred to U.S. Patent No. 5,359,046 (incorporated herein by reference)
for such methods. A single-chain antibody is created by fusing together the variable domains of
the heavy and light chains using a short peptide linker, thereby reconstituting an antigen binding
site on a single molecule.

Single-chain antibody variable fragments (Fvs) in which the C-terminus of one variable 
domain is tethered to the N-terminus of the other via a 15 to 25 amino acid peptide or linker, 
have been developed without significantly disrupting antigen binding or specificity of the 
binding (Bedzyk et al., 1990; Chaudhary et al., 1990). These Fvs lack the constant regions (Fc)
present in the heavy and light chains of the native antibody.

Antibodies to a wide variety of molecules are contemplated, such as oncogenes, toxins,
hormones, enzymes, viral or bacterial antigens, transcription factors or receptors.

d. Antisense

The instant invention can be used to transfect eukaryotic cells with ribonucleotide
sequences, including anti-sense RNA and ribozymes, that function to inhibit the expression of
any mRNA of interest, either by binding directly to the mRNA of interest or by blocking
deoxyribonucleic acid (DNA) coding sequences and thereby preventing transcription.

Anti-sense RNA inhibits the translation of mRNA by direct binding to the mRNA of 
interest and preventing protein translation, either by inhibition of ribosome binding or the 
translocation of the targeted mRNA molecule which then becomes more susceptible to nuclease 
degradation. 

Antisense methodology takes advantage of the fact that nucleic acids tend to pair with
"complementary" sequences. By complementary, it is meant that polynucleotides are those
which are capable of base-pairing according to the standard Watson-Crick complementarity
rules. That is, the larger purines will base pair with the smaller pyrimidines to form
combinations of guanine paired with cytosine (G:C) and adenine paired with either thymine
(A:T) in the case of DNA, or adenine paired with uracil (A:U) in the case of RNA. Inclusion of
less common bases such as inosine, 5-methylcytosine, 6-methyladenine, hypoxanthine and
others in hybridizing sequences does not interfere with pairing. Oncogenes such as ras, myc,
neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl also are suitable targets for antisense
constructs.
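
The Watson-Crick pairing rules described above can be illustrated with a minimal sketch that derives the antisense RNA for a given mRNA fragment. The function name `antisense_rna` and the example sequence are hypothetical and chosen only for illustration; the sketch handles the four standard RNA bases and does not model the less common bases mentioned above.

```python
# Watson-Crick pairing for RNA:RNA duplexes (A:U and G:C).
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (written 5'->3') for an mRNA written 5'->3'."""
    mrna = mrna.upper()
    unknown = set(mrna) - set(RNA_PAIRS)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # Complement each base and reverse the order, because the two strands
    # of a duplex run antiparallel to one another.
    return "".join(RNA_PAIRS[base] for base in reversed(mrna))


# Hypothetical mRNA fragment, used only to exercise the function.
print(antisense_rna("AUGGCCUAA"))  # UUAGGCCAU
```

Applying the function twice returns the original sequence, which is a convenient sanity check on the pairing table.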

Targeting double-stranded (ds) DNA with polynucleotides leads to triple-helix 
formation; targeting RNA will lead to double-helix formation. Antisense polynucleotides, 
when introduced into a target cell, specifically bind to their target polynucleotide and interfere 
with transcription, RNA processing, transport, translation and/or stability. Antisense RNA
constructs, or DNA encoding such antisense RNAs, may be employed to inhibit gene
transcription or translation or both within a host cell, either in vitro or in vivo, such as within a
host animal, including a human subject.

Antisense constructs may be designed to bind to the promoter and other control regions, 
exons, introns or even exon-intron boundaries of a gene. It is contemplated that the most
effective antisense constructs will include regions complementary to intron/exon splice 
junctions. Thus, it is proposed that a preferred embodiment includes an antisense construct with 
complementarity to regions within 50-200 bases of an intron-exon splice junction. It has been 
observed that some exon sequences can be included in the construct without seriously affecting 
the target selectivity thereof. The amount of exonic material included will vary depending on
the particular exon and intron sequences used. One can readily test whether too much exon 
DNA is included simply by testing the constructs in vitro to determine whether normal cellular 
function is affected or whether the expression of related genes having complementary sequences 
is affected. 

As stated above, "complementary" or "antisense" means polynucleotide sequences that
are substantially complementary over their entire length and have very few base mismatches.
For example, sequences of fifteen bases in length may be termed complementary when they
have complementary nucleotides at thirteen or fourteen positions. Naturally, sequences which
are completely complementary will be sequences which are entirely complementary throughout
their entire length and have no base mismatches. Other sequences with lower degrees of
homology also are contemplated. For example, an antisense construct which has limited
regions of high homology, but also contains a non-homologous region (e.g., ribozyme) could be
designed. These molecules, though having less than 50% homology, would bind to target
sequences under appropriate conditions.
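
The fifteen-base example above can be expressed as a short check. This is a sketch only: the function names and the default tolerance of two mismatches (corresponding to thirteen or more paired positions out of fifteen) are assumptions made for illustration, and the sequences are hypothetical.

```python
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_paired_positions(antisense: str, target: str) -> int:
    """Count Watson-Crick pairs between two equal-length RNAs (both 5'->3').

    The strands are aligned antiparallel, so the antisense strand is
    compared against the target read in reverse.
    """
    if len(antisense) != len(target):
        raise ValueError("sequences must be the same length")
    return sum(
        1
        for a, t in zip(antisense.upper(), reversed(target.upper()))
        if RNA_PAIRS.get(a) == t
    )


def is_substantially_complementary(antisense: str, target: str,
                                   max_mismatches: int = 2) -> bool:
    """True when at most `max_mismatches` positions fail to pair."""
    paired = count_paired_positions(antisense, target)
    return len(target) - paired <= max_mismatches


target = "AUGGCCUAAGCUAGC"       # hypothetical 15-base target
perfect = "GCUAGCUUAGGCCAU"      # exact reverse complement of the target
one_off = "ACUAGCUUAGGCCAU"      # one mismatched position
three_off = "AAAAGCUUAGGCCAU"    # three mismatched positions

print(count_paired_positions(perfect, target))             # 15
print(is_substantially_complementary(one_off, target))     # True
print(is_substantially_complementary(three_off, target))   # False
```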

It may be advantageous to combine portions of genomic DNA with cDNA or synthetic 
sequences to generate specific constructs. For example, where an intron is desired in the 
ultimate construct, a genomic clone will need to be used. The cDNA or a synthesized 
polynucleotide may provide more convenient restriction sites for the remaining portion of the
construct and, therefore, would be used for the rest of the sequence.

e. Ribozymes

Ribozymes are RNA molecules that catalyze the specific cleavage of RNA. Ribozyme
activity is mediated through the hybridization of the ribozyme molecule to a specific sequence
in the target RNA, followed by the endonucleolytic cleavage of the target RNA within that
sequence. Potential RNA cleavage sites can be identified by searching for specific
ribonucleotide sequences that include sequences such as GUU, GUC, and GUA within the
target RNA. Hammerhead motif ribozyme molecules can then be designed that contain short
RNA sequences (15-25 ribonucleotides) that are complementary to the region including the
cleavage site of the target RNA.
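
The search for candidate cleavage sites described above can be sketched as a simple scan for the GUU, GUC and GUA triplets. The function name `find_cleavage_sites`, the `flank` parameter and the example sequence are hypothetical; the default of eight flanking bases on each side is an assumption chosen so that the reported window (up to 19 bases) falls within the 15-25 ribonucleotide range mentioned in the text. The sketch only locates triplets and does not assess RNA secondary structure or accessibility.

```python
import re


def find_cleavage_sites(target_rna: str, flank: int = 8):
    """List candidate ribozyme cleavage sites in an RNA written 5'->3'.

    Returns (position, triplet, window) tuples, where position is the
    0-based index of the G in the triplet and window is the surrounding
    target region to which a complementary sequence could be designed.
    """
    target_rna = target_rna.upper()
    sites = []
    for match in re.finditer(r"GU[UCA]", target_rna):
        start = match.start()
        lo = max(0, start - flank)
        hi = min(len(target_rna), match.end() + flank)
        sites.append((start, match.group(0), target_rna[lo:hi]))
    return sites


# Hypothetical target fragment, used only to exercise the function.
for position, triplet, window in find_cleavage_sites("AAGUCGGAGUUCCGUAAA"):
    print(position, triplet, window)
```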

Ribozymes are RNA-protein complexes that cleave nucleic acids in a site-specific
fashion. Ribozymes have specific catalytic domains that possess endonuclease activity (Kim
and Cook, 1987; Gerlach et al., 1987; Forster and Symons, 1987). For example, a large number
of ribozymes accelerate phosphoester transfer reactions with a high degree of specificity, often
cleaving only one of several phosphoesters in an oligonucleotide substrate (Cook et al., 1981;
Michel and Westhof, 1990; Reinhold-Hurek and Shub, 1992). This specificity has been
attributed to the requirement that the substrate bind via specific base-pairing interactions to the
internal guide sequence ("IGS") of the ribozyme prior to chemical reaction.

Ribozyme catalysis has primarily been observed as part of sequence-specific
cleavage/ligation reactions involving nucleic acids (Joyce, 1989; Cook et al., 1981). For
example, U.S. Patent No. 5,354,855 reports that certain ribozymes can act as endonucleases
with a sequence specificity greater than that of known ribonucleases and approaching that of the
DNA restriction enzymes. Thus, sequence-specific ribozyme-mediated inhibition of gene
expression may be particularly suited to therapeutic applications (Scanlon et al., 1991; Sarver et
al., 1990). Recently, it was reported that ribozymes elicited genetic changes in some cell lines
to which they were applied; the altered genes included the oncogenes H-ras, c-fos and genes of
HIV. Most of this work involved the modification of a target mRNA, based on a specific
mutant codon that is cleaved by a specific ribozyme.

Since the secondary structure of both the target RNA and the anti-sense RNA is of
great importance for the hybridization of both molecules, the predicted structural features can
be analyzed and RNase protection assays can be used to determine hybridization efficiency.
Anti-sense RNA and ribozymes can be synthesized employing chemical nucleic acid synthesis
techniques well known in the art (i.e., solid phase phosphoramidite synthesis) or the RNA
molecules can be produced by in vitro and in vivo transcription of DNA sequences encoding the
antisense RNA. DNA sequences encoding ribozymes or anti-sense RNA may be incorporated
into an expression vector. The expression vector may be prebound to purified plasma
lipoprotein fractions prior to transfection into eukaryotic cells.

f. Self-initiating and self-sustaining gene expression systems 

The gene delivery system of the invention can also be used to deliver self-initiating and
self-sustaining gene expression systems. Self-initiating and self-sustaining gene expression systems
may be constructed by binding an RNA polymerase to a DNA construct in vitro prior to the
introduction of the polynucleotide into the cell as described by Wagner et al. (U.S. Patent No.
5,591,601). The RNA polymerase is bound to a DNA construct containing a cognate promoter
of the RNA polymerase operably linked to a DNA sequence encoding the RNA polymerase.

The expression of functional RNA polymerase in turn enables the expression of any 
gene of interest that contains a cognate promoter sequence recognized by the same RNA
polymerase in eukaryotic host cells. DNA sequences encoding both the RNA polymerase and the
gene product of interest (i.e., protein of interest) may be contained within the same gene
expression system. The gene expression system may be prebound to purified plasma 
lipoprotein fractions prior to transfection into eukaryotic cells. 

g. Delivery of DNA to cells in vivo

The gene delivery system of the invention can also be used to deliver DNA to cells in vivo. An
expression vector containing the polynucleotide sequences of the gene of interest (e.g., a reporter
gene or a healthy copy of a defective gene) is prebound to LDL according to the protocols
described herein. This DNA-LDL complex is then introduced into an organism, for example a
rat, mouse or human, by, for example, intravenous injection. At varying times post-injection,
LDL is isolated from the blood and probed for DNA sequences of the type that were prebound
to the LDL using standard molecular biological techniques such as, but not limited to, Southern
blot hybridization or PCR™.

The LDL also can be immunoprecipitated with anti-LDL antibodies and then probed for
specific DNA sequences bound to it. In order to determine cellular internalization and/or
integration of the reporter gene sequences into the genomic DNA of cells of different tissues,
total genomic DNA can be isolated from various tissues (according to standard molecular
biology techniques) and probed for the presence of the reporter gene sequences using specific
polynucleotide probes in PCR™ or Southern blot hybridization techniques. In addition, total
cellular RNA can be isolated from various different tissues using standard molecular biology
techniques and probed for the presence of specific mRNA encoded by the reporter gene
polynucleotide sequences using specific antisense polynucleotide probes in Northern blot
hybridization techniques or ribonuclease (RNase) protection assays.

Expression of a functional protein encoded by the gene of interest in different tissues
can be analyzed using techniques well known in the art, such as Western blot analysis of
cellular protein extracts with antibodies that bind specifically to the reporter gene product (i.e.,
protein of interest) or direct detection of intracellular fluorescence (e.g., when reporter genes are
used that encode blue or green fluorescent proteins, such as GFP from Clontech Inc.).

Several non-viral methods for the transfer of a DNA-LDL complex of the present
invention into cultured mammalian cells also are contemplated by the present invention. These
include calcium phosphate precipitation (Graham and Van Der Eb, 1973; Chen and Okayama,
1987; Rippe et al., 1990), DEAE-dextran (Gopal, 1985), electroporation (Tur-Kaspa et al., 1986;
Potter et al., 1984), direct microinjection (Harland and Weintraub, 1985), DNA-loaded
liposomes (Nicolau and Sene, 1982; Fraley et al., 1979) and lipofectamine-DNA complexes,
cell sonication (Fechheimer et al., 1987), gene bombardment using high velocity
microprojectiles (Yang et al., 1990), and receptor-mediated transfection (Wu and Wu, 1987;
Wu and Wu, 1988). Some of these techniques may be successfully adapted for in vivo or ex
vivo use.

Once the DNA-LDL complex has been delivered into the cell, the nucleic acid encoding
the gene of interest may be positioned and expressed at different sites. In certain embodiments,
the nucleic acid encoding the gene may be stably integrated into the genome of the cell. This
integration may be in the cognate location and orientation via homologous recombination (gene
replacement) or it may be integrated in a random, non-specific location (gene augmentation). In
yet further embodiments, the nucleic acid may be stably maintained in the cell as a separate,
episomal segment of DNA. Such nucleic acid segments or "episomes" encode sequences
sufficient to permit maintenance and replication independent of or in synchronization with the
host cell cycle. How the DNA-LDL complex is delivered to a cell and where in the cell the
nucleic acid remains depend on the type of DNA molecule bound to the LDL.

In one embodiment of the invention, the DNA-LDL complex may simply consist of
naked recombinant DNA or plasmids. Transfer of the construct may be performed by any of
the methods mentioned above which physically or chemically permeabilize the cell membrane.
This is particularly applicable for transfer in vitro but it may be applied to in vivo use as well.
Dubensky et al. (1984) successfully injected polyomavirus DNA in the form of calcium
phosphate precipitates into liver and spleen of adult and newborn mice, demonstrating active
viral replication and acute infection. Benvenisty and Neshif (1986) also demonstrated that
direct intraperitoneal injection of calcium phosphate-precipitated plasmids results in expression
of the transfected genes. It is envisioned that DNA encoding a gene of interest may also be
transferred in a similar manner in vivo and express the gene product.

Another embodiment of the invention for transferring a naked DNA-LDL complex into
cells may involve particle bombardment. This method depends on the ability to accelerate
DNA-coated microprojectiles to a high velocity, allowing them to pierce cell membranes and
enter cells without killing them (Klein et al., 1987). Several devices for accelerating small
particles have been developed. One such device relies on a high voltage discharge to generate
an electrical current, which in turn provides the motive force (Yang et al., 1990). The
microprojectiles used have consisted of biologically inert substances such as tungsten or gold
beads.

Selected organs including the liver, skin, and muscle tissue of rats and mice have been
bombarded in vivo (Yang et al., 1990; Zelenin et al., 1991). This may require surgical exposure
of the tissue or cells, to eliminate any intervening tissue between the gun and the target organ,
i.e., ex vivo treatment. Again, DNA encoding a particular gene may be delivered via this
method and still be incorporated by the present invention.

In a further embodiment of the invention, the DNA-LDL complex may be entrapped in a 
liposome. Liposomes are vesicular structures characterized by a phospholipid bilayer 
membrane and an inner aqueous medium. Multilamellar liposomes have multiple lipid layers
separated by aqueous medium. They form spontaneously when phospholipids are suspended in 
an excess of aqueous solution. The lipid components undergo self-rearrangement before the 
formation of closed structures and entrap water and dissolved solutes between the lipid bilayers 
(Ghosh and Bachhawat, 1991). Also contemplated are lipofectamine-DNA complexes. 

Liposome-mediated nucleic acid delivery and expression of foreign DNA in vitro has 
been very successful. Wong et al. (1980) demonstrated the feasibility of liposome-mediated
delivery and expression of foreign DNA in cultured chick embryo, HeLa and hepatoma cells.
Nicolau et al. (1987) accomplished successful liposome-mediated gene transfer in rats after
intravenous injection.

In certain embodiments of the invention, the liposome may be complexed with a
hemagglutinating virus (HVJ). This has been shown to facilitate fusion with the cell membrane
and promote cell entry of liposome-encapsulated DNA (Kaneda et al., 1989). In other
embodiments, the liposome may be complexed or employed in conjunction with nuclear
non-histone chromosomal proteins (HMG-1) (Kato et al., 1991). In yet further embodiments, the
liposome may be complexed or employed in conjunction with both HVJ and HMG-1. Because
such expression constructs have been successfully employed in transfer and expression of
nucleic acid in vitro and in vivo, they are applicable to the present invention. Where a
bacterial promoter is employed in the DNA construct, it also will be desirable to include within
the liposome an appropriate bacterial polymerase.

Other DNA-LDL complexes which can be employed to deliver a nucleic acid encoding
a particular gene into cells are receptor-mediated delivery vehicles. These take advantage of the
selective uptake of macromolecules by receptor-mediated endocytosis in almost all eukaryotic
cells. Because of the cell type-specific distribution of various receptors, the delivery can be
highly specific (Wu and Wu, 1993).

Receptor-mediated gene targeting vehicles generally consist of two components: a cell 
receptor-specific ligand and a DNA-binding agent. Several ligands have been used for
receptor-mediated gene transfer. The most extensively characterized ligands are asialoorosomucoid
(ASOR) (Wu and Wu, 1987) and transferrin (Wagner et al., 1990). Recently, a synthetic
neoglycoprotein, which recognizes the same receptor as ASOR, has been used as a gene
delivery vehicle (Ferkol et al., 1993; Perales et al., 1994), and epidermal growth factor (EGF)
has also been used to deliver genes to squamous carcinoma cells (Myers, EPO 0273085).

In other embodiments, the delivery vehicle may comprise a ligand and a liposome. For 
example, Nicolau et al. (1987) employed lactosyl-ceramide, a galactose-terminal
asialoganglioside, incorporated into liposomes and observed an increase in the uptake of the
insulin gene by hepatocytes. Thus, it is feasible that a nucleic acid encoding a particular gene
also may be specifically delivered into a cell type such as lung, epithelial or tumor cells, by any
number of receptor-ligand systems with or without liposomes. For example, epidermal growth
factor (EGF) may be used as the ligand for receptor-mediated delivery of a nucleic acid encoding a
gene in many tumor cells that exhibit upregulation of EGF receptor. Mannose can be used to
target the mannose receptor on liver cells. Also, antibodies to CD5 (CLL), CD22 (lymphoma),
CD25 (T-cell leukemia) and MAA (melanoma) can similarly be used as targeting moieties.

In certain embodiments, gene transfer may more easily be performed under ex vivo 
conditions. Ex vivo gene therapy refers to the isolation of cells from an animal, the delivery of a 
nucleic acid into the cells in vitro, and then the return of the modified cells back into an animal. 
This may involve the surgical removal of tissue/organs from an animal or the primary culture of
cells and tissues. Anderson et al., U.S. Patent 5,399,346, incorporated herein in its
entirety, disclose ex vivo therapeutic methods.

10. PHARMACEUTICAL 

The gene delivery system of the instant invention can be administered in vivo in various
ways including, but not limited to, intravenous, pharyngeal, epidermal, intramuscular, 
intraperitoneal (IP), nasal, and/or rectal. The gene delivery system of the instant invention can 
also be used for in vitro transfections of eukaryotic cell types which possess specific lipoprotein 
receptors on their cytoplasmic membranes, but is not limited to these cell types. 

Pharmaceutical products that may spring from the current invention may comprise 
naked polynucleotide containing single or multiple copies of the specific nucleotide sequences 
that bind to specific DNA-binding sites of the apolipoproteins present on plasma lipoproteins as 
described in the current invention. The polynucleotide may encode a biologically active 
peptide, antisense RNA, or ribozyme and will be provided in a physiologically acceptable
administrable form. 

Another pharmaceutical product that may spring from the current invention may 
comprise a highly purified plasma lipoprotein fraction, isolated according to the methodology
described herein from either the patient's blood or another source, and a polynucleotide containing
single or multiple copies of the specific nucleotide sequences that bind to specific DNA-binding 
sites of the apolipoproteins present on plasma lipoproteins, prebound to the purified lipoprotein 
fraction in a physiologically acceptable, administrable form. 

Yet another pharmaceutical product may comprise a highly purified plasma lipoprotein
fraction which contains recombinant apolipoprotein fragments containing single or multiple 
copies of specific DNA-binding motifs, prebound to a polynucleotide containing single or 
multiple copies of the specific nucleotide sequences, in a physiologically acceptable 
administrable form.

The dosage to be administered depends to a great extent on the body weight and 
physical condition of the subject being treated as well as the route of administration and
frequency of treatment. A pharmaceutical composition comprising the naked polynucleotide
prebound to a highly purified lipoprotein fraction may be administered in amounts ranging from
1 µg to 1 mg polynucleotide and 1 µg to 100 mg protein.

Administration of the therapeutic virus particle to a patient will follow general protocols
for the administration of chemotherapeutics, taking into account the toxicity, if any, of the 
vector. It is anticipated that the treatment cycles would be repeated as necessary. It also is 
contemplated that various standard therapies, as well as surgical intervention, may be applied in 
combination with the described gene therapy. 

Where clinical application of a gene therapy is contemplated, it will be necessary to 
prepare the complex as a pharmaceutical composition appropriate for the intended application. 
Generally this will entail preparing a pharmaceutical composition that is essentially free of 
pyrogens, as well as any other impurities that could be harmful to humans or animals. One also 
will generally desire to employ appropriate salts and buffers to render the complex stable and
allow for complex uptake by target cells. 

Aqueous compositions of the present invention comprise an effective amount of the 
compound, dissolved or dispersed in a pharmaceutically acceptable carrier or aqueous medium.
Such compositions can also be referred to as inocula. The phrases "pharmaceutically or
pharmacologically acceptable" refer to molecular entities and compositions that do not produce
an adverse, allergic or other untoward reaction when administered to an animal, or a human, as
appropriate. As used herein, "pharmaceutically acceptable carrier" includes any and all
solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption
delaying agents and the like. The use of such media and agents for pharmaceutically active
substances is well known in the art. Except insofar as any conventional media or agent is
incompatible with the active ingredient, its use in the therapeutic compositions is contemplated.
Supplementary active ingredients also can be incorporated into the compositions.

The compositions of the present invention may include classic pharmaceutical 
preparations. Dispersions also can be prepared in glycerol, liquid polyethylene glycols, and
mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations 
contain a preservative to prevent the growth of microorganisms. 

i) Disease States 

A wide variety of disease states may be treated with compositions according to the
present invention. In essence, any disease that can be treated by provision of a protein or 
nucleic acid is amenable to this approach. Disease states include a variety of genetic 
abnormalities such as diabetes, cancer, cystic fibrosis and various other diseases that could be 
treated by increasing or decreasing expression of a protein in a target cell. 

Depending on the particular disease to be treated, administration of therapeutic 
compositions according to the present invention will be via any common route so long as the 
target tissue is available via that route. This includes oral, nasal, buccal, rectal, vaginal or 
topical. Topical administration would be particularly advantageous for treatment of skin 
cancers. Alternatively, administration will be by orthotopic, intradermal, subcutaneous,
intramuscular, intraperitoneal or intravenous injection. Such compositions would normally be 
administered as pharmaceutically acceptable compositions that include physiologically 
acceptable carriers, buffers or other excipients. 

In certain embodiments, ex vivo therapies also are contemplated. Ex vivo therapies
involve the removal, from a patient, of target cells. The cells are treated outside the patient's
body and then returned. One example of ex vivo therapy would involve a variation of
autologous bone marrow transplant (ABMT). Many times, ABMT fails because some cancer cells are
present in the withdrawn bone marrow, and return of the bone marrow to the treated patient
results in repopulation of the patient with cancer cells. In one embodiment, however, the
withdrawn bone marrow cells could be treated while outside the patient with an LDL-DNA
particle that targets and kills the cancer cells. Once the bone marrow cells are "purged," they can
be reintroduced into the patient.

The treatments may include various "unit doses." Unit dose is defined as containing a
predetermined quantity of the therapeutic composition calculated to produce the desired
responses in association with its administration, i.e., the appropriate route and treatment
regimen. The quantity to be administered, and the particular route and formulation, are within
the skill of those in the clinical arts. Also of import is the subject to be treated, in particular, the
state of the subject and the protection desired. A unit dose need not be administered as a single
injection but may comprise continuous infusion over a set period of time. Unit dose of the
present invention may conveniently be described in terms of 0.01 mg DNA/kg body weight
to 0.4 mg DNA/kg body weight, with ranges in between these being contemplated such that
0.05, 0.10, 0.15, 0.20, 0.25, 0.5 mg DNA/kg body weight are administered. Likewise the
amount of LDL delivered can vary from about 0.2 to about 8.0 mg/kg body weight. Thus in
particular embodiments, 0.4 mg, 0.5 mg, 0.8 mg, 1.0 mg, 1.5 mg, 2.0 mg, 2.5 mg, 3.0 mg, 4.0
mg, 5.0 mg, 5.5 mg, 6.0 mg, 6.5 mg, 7.0 mg and 7.5 mg of LDL may be delivered to an
individual in vivo. The dosage of DNA:LDL to be administered depends to a great extent on
the weight and physical condition of the subject being treated as well as the route of
administration and the frequency of treatment. A pharmaceutical composition comprising the
naked polynucleotide prebound to a highly purified lipoprotein fraction may be administered in
amounts ranging from 1 µg to 1 mg polynucleotide and 1 µg to 100 mg protein. Thus, particular
compositions may comprise 1 µg, 5 µg, 10 µg, 20 µg, 30 µg, 40 µg, 50 µg, 60 µg, 70 µg, 80 µg,
100 µg, 150 µg, 200 µg, 250 µg, 500 µg, 600 µg, 700 µg, 800 µg, 900 µg or 1000 µg polynucleotide
that is bound independently to 1 µg, 5 µg, 10 µg, 20 µg, 30 µg, 40 µg, 50 µg, 60 µg, 70 µg, 80 µg,
100 µg, 150 µg, 200 µg, 250 µg, 500 µg, 600 µg, 700 µg, 800 µg, 900 µg or 1000 µg, 1.5 mg, 5 mg,
10 mg, 20 mg, 30 mg, 40 mg, 50 mg, 60 mg, 70 mg, 80 mg, 90 mg or 100 mg lipoprotein. Any
amount of polynucleotide may be bound to any other amount of lipoprotein to achieve the
pharmaceutical concentrations of the present invention.
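
The weight-based ranges above convert to absolute amounts by simple multiplication. The following is a minimal arithmetic sketch only, not a dosing recommendation. The function name `unit_dose_range` and the 70 kg example subject are hypothetical, and the lower DNA bound is assumed to read 0.01 mg/kg.

```python
# Illustrative arithmetic only; names and the example weight are assumptions.
DNA_MG_PER_KG = (0.01, 0.4)   # assumed reading of the DNA unit-dose range
LDL_MG_PER_KG = (0.2, 8.0)    # LDL range as stated in the text


def unit_dose_range(body_weight_kg):
    """Return (min, max) absolute amounts in mg for DNA and for LDL."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    dna = tuple(round(x * body_weight_kg, 3) for x in DNA_MG_PER_KG)
    ldl = tuple(round(x * body_weight_kg, 3) for x in LDL_MG_PER_KG)
    return {"dna_mg": dna, "ldl_mg": ldl}


if __name__ == "__main__":
    # Hypothetical 70 kg subject
    print(unit_dose_range(70))
```

The per-kilogram bounds are kept as module-level constants so that a different reading of the stated range can be substituted without touching the function.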

ii) Cancer

One of the preferred embodiments of the present invention involves the use of the LDL 
vectors to deliver therapeutic genes to cancer cells. Target cancer cells include cancers of the 
lung, brain, prostate, kidney, liver, ovary, breast, skin, stomach, esophagus, head & neck, 
testicles, colon, cervix, lymphatic system and blood. Of particular interest are non-small cell
lung carcinomas including squamous cell carcinomas, adenocarcinomas and large cell 
undifferentiated carcinomas. 

According to the present invention, one may treat the cancer by directly injecting a
tumor with the LDL vector. Alternatively, the tumor may be infused or perfused with the
vector using any suitable delivery vehicle. Local or regional administration, with respect to the
tumor, also is contemplated. Finally, systemic administration may be performed. Continuous
administration also may be applied where appropriate, for example, where a tumor is excised
and the tumor bed is treated to eliminate residual, microscopic disease. Delivery via syringe or
catheterization is preferred. Such continuous perfusion may take place for a period from about
1-2 hours, to about 2-6 hours, to about 6-12 hours, to about 12-24 hours, to about 1-2 days, to
about 1-2 weeks or longer following the initiation of treatment. Generally, the dose of the
therapeutic composition via continuous perfusion will be equivalent to that given by a single or
multiple injections, adjusted over a period of time during which the perfusion occurs.

For tumors of > 4 cm, the volume to be administered will be about 4-10 ml (preferably 
10 ml), while for tumors of < 4 cm, a volume of about 1-3 ml will be used (preferably 3 ml). 
Multiple injections delivered as a single dose comprise about 0.1 to about 0.5 ml volumes. The
LDL-DNA particles may advantageously be contacted by administering multiple injections to
the tumor, spaced at approximately 1 cm intervals.

In certain embodiments, the tumor being treated may not, at least initially, be resectable. 
Treatments with therapeutic constructs may increase the resectability of the tumor due to 
shrinkage at the margins or by elimination of certain particularly invasive portions. Following 
treatments, resection may be possible. Additional treatments subsequent to resection will serve
to eliminate microscopic residual disease at the tumor site.

A typical course of treatment, for a primary tumor or a post-excision tumor bed, will
involve multiple doses. Typical primary tumor treatment involves a 6-dose application over a
two-week period. The two-week regimen may be repeated one, two, three, four, five, six or
more times. During a course of treatment, the need to complete the planned dosings may be
reevaluated.

Cancer therapies also include a variety of combination therapies with both chemical and 
radiation based treatments. Combination chemotherapies include, for example, cisplatin 
(CDDP), carboplatin, procarbazine, mechlorethamine, cyclophosphamide, ifosfamide,
melphalan, chlorambucil, busulfan, nitrosourea, dactinomycin, daunorubicin, doxorubicin,
bleomycin, plicamycin, mitomycin, etoposide (VP16), tamoxifen, taxol, transplatinum,
5-fluorouracil, vincristine, vinblastine and methotrexate.

Combination radiation therapies may be x- and γ-irradiation. Dosage ranges for
x-irradiation range from daily doses of 2000 to 6000 roentgens for prolonged periods of time (3 to
4 weeks), to single doses of 2000 to 6000 roentgens. Dosages for radioisotopes vary widely, 
and depend on the half-life of the isotope, the strength and type of radiation emitted, and the 
uptake by neoplastic cells. 

Various combinations may be employed, where gene therapy is "A" and the radio- or
chemotherapeutic agent is "B":

A/B/A B/A/B B/B/A A/A/B A/B/B B/A/A A/B/B/B B/A/B/B

B/B/B/A B/B/A/B A/A/B/B A/B/A/B A/B/B/A B/B/A/A

B/A/B/A B/A/A/B A/A/A/B B/A/A/A A/B/A/A A/A/B/A
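
The twenty schedules listed above are exactly the three- and four-step sequences that use both "A" and "B" at least once. The sketch below enumerates them for illustration; the function name `mixed_schedules` is hypothetical and is not part of the disclosure.

```python
from itertools import product


def mixed_schedules(lengths=(3, 4)):
    """List every A/B sequence of the given lengths that uses both treatments."""
    schedules = []
    for n in lengths:
        for seq in product("AB", repeat=n):
            # Skip courses made up of only one treatment (all A or all B).
            if "A" in seq and "B" in seq:
                schedules.append("/".join(seq))
    return schedules


if __name__ == "__main__":
    for schedule in mixed_schedules():
        print(schedule)
```

Enumerating rather than hard-coding the list makes it easy to check that no mixed three- or four-step schedule has been omitted.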

The terms "contacted" and "exposed," when applied to a cell, are used herein to describe
the process by which a therapeutic construct and a chemotherapeutic or radiotherapeutic agent
are delivered to a target cell or are placed in direct juxtaposition with the target cell. To achieve
cell killing or stasis, both agents are delivered to a cell in a combined amount effective to kill 
the cell or prevent it from dividing. 

The therapeutic compositions of the present invention are advantageously administered
in the form of injectable compositions either as liquid solutions or suspensions; solid forms 
suitable for solution in, or suspension in, liquid prior to injection may also be prepared. These 
preparations also may be emulsified. A typical composition for such purpose comprises a 
pharmaceutically acceptable carrier. For instance, the composition may contain 10 mg, 25 mg, 
50 mg or up to about 100 mg of human serum albumin per milliliter of phosphate buffered
saline. 

Other pharmaceutically acceptable carriers include aqueous solutions, non-toxic 
excipients, including salts, preservatives, buffers and the like. Examples of non-aqueous
solvents are propylene glycol, polyethylene glycol, vegetable oil and injectable organic esters
such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, saline
solutions, parenteral vehicles such as sodium chloride, Ringer's dextrose, etc. Intravenous
vehicles include fluid and nutrient replenishers. Preservatives include antimicrobial agents,
anti-oxidants, chelating agents and inert gases. The pH and exact concentration of the various
components of the pharmaceutical composition are adjusted according to well known parameters.

Additional formulations are suitable for oral administration. Oral formulations include 
such typical excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, 
magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. The 
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release
formulations or powders. When the route is topical, the form may be a cream, ointment, salve
or spray. 

11. EXAMPLES 

The following examples are included to demonstrate preferred embodiments of the
invention. It should be appreciated by those of skill in the art that the techniques disclosed in
the examples which follow represent techniques discovered by the inventor to function well in
the practice of the invention, and thus can be considered to constitute preferred modes for its
practice. However, those of skill in the art should, in light of the present disclosure, appreciate
that many changes can be made in the specific embodiments which are disclosed and still obtain
a like or similar result without departing from the spirit and scope of the invention.

EXAMPLE 1 
MATERIALS AND METHODS 
1. Isolation of Plasma Lipoproteins

Restriction endonucleases were purchased from Life Technologies, and protease
inhibitors (i.e., leupeptin, PMSF, and Trasylol) were purchased from Sigma Chemical
Company. Plasma lipoproteins were isolated using standard sequential flotation 
ultracentrifugation methods as described (Schumaker and Puppione, 1986). Throughout the 
entire procedure samples were kept on ice or at 4°C unless otherwise stated. 

Subjects were fasted for at least 4 h prior to the start of the experimental procedures. 
Blood was drawn into sterile, evacuated glass tubes containing anticoagulants, e.g., 0.1%
(ethylenedinitrilo)tetraacetic acid (EDTA) or heparin. Plasma was obtained by centrifugation
(10 minutes at 3000 x g) and immediately adjusted to 0.005% phenylmethanesulfonyl fluoride
(PMSF), 10 KIU Trasylol/ml, and 1 µg leupeptin/ml. VLDL, LDL, and HDL fractions were
isolated by sequential flotation ultracentrifugation for 18 h at 40,000 rpm in a Beckman
centrifuge Model L8-80M after plasma samples were adjusted with potassium bromide (KBr)
to solution densities of 1.006, 1.019, and 1.215 g/ml, respectively. Immediately following
ultracentrifugation, individual lipoprotein fractions were collected and dialyzed extensively
against phosphate buffered saline (pH 7.4) containing 0.001% sodium azide. Protein
concentrations were determined using standard BCA protein assays (Pierce Chemical
Company).



2. DNA-Binding Protocol

Lipoproteins and DNA were mixed together and incubated for 30 min at room
temperature in 50 mmoles/liter Tris (pH 7.4), 100-154 mmoles/liter sodium chloride (NaCl), and 15
mmoles/liter magnesium chloride (MgCl2). 6X sample loading buffer (30% glycerol, 0.25%
xylene cyanole FF, 0.25% bromophenol blue) was added to the samples in a 1:5 V/V ratio.
Samples were underloaded into 30 µl wells at the cathode edge of a 0.8% agarose gel
containing 1 µg ethidium bromide/ml in Tris-acetate buffer (pH 7.85), and electrophoresis was
carried out at a constant 100 V until the negatively charged tracking dye had migrated at
least 50% of the distance from the loading well to the anodic edge of the gel.
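
As a worked illustration of the loading-buffer step, the sketch below assumes that the 1:5 V/V ratio means one volume of 6X buffer per five volumes of sample, which brings the buffer to 1X in the loaded mixture. The function name `loading_mix` and the 25 µl sample volume are hypothetical.

```python
def loading_mix(sample_ul, stock_x=6.0, buffer_parts=1, sample_parts=5):
    """Return (buffer volume, total volume, final buffer concentration in X).

    Assumes the stated 1:5 V/V ratio is buffer:sample.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    buffer_ul = sample_ul * buffer_parts / sample_parts
    total_ul = sample_ul + buffer_ul
    final_x = stock_x * buffer_ul / total_ul
    return buffer_ul, total_ul, final_x


if __name__ == "__main__":
    # Hypothetical 25 µl binding reaction
    print(loading_mix(25))
```

Under this reading, a 25 µl sample plus 5 µl of buffer gives a 30 µl total, which is the stated well capacity.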

3. Agarose Electrophoretogram of Human Lipoproteins

Agarose electrophoresis of human lipoproteins was performed to illustrate the
differential migration patterns of lipoprotein fractions VLDL, LDL, and HDL isolated from
human plasma resolved using non-denaturing conditions. 

Plasma lipoproteins were isolated from human blood according to the protocol described 
above. 6X sample loading buffer (30% glycerol, 0.25% xylene cyanole FF, 0.25%
bromophenol blue) was added to the samples in a 1:5 V/V ratio. Samples were underloaded
into 30 µl wells at the cathode edge of a 0.8% agarose gel in Tris-acetate buffer (pH 7.85),
and electrophoresis was carried out at a constant 100 V until the negatively charged
tracking dye had migrated at least 50% of the distance from the loading well to the anodic edge 
of the gel. 

Following electrophoresis, the agarose gel was stained for protein in a solution 
containing 50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie Brilliant Blue R-250
(CBB R-250, Bio-Rad Labs). Lane 1 contained human VLDL (10 µg protein), Lane 2
contained human LDL (35 µg protein), and Lane 3 contained human HDL (35 µg protein).
Results illustrated the differential migration of lipoprotein fractions, VLDL, LDL, and HDL,
isolated from human plasma resolved using non-denaturing conditions by agarose gel 
electrophoresis. Lipoproteins were visualized using a protein binding dye, Coomassie Brilliant 
Blue (CBB). The absence of other bands in each lane indicated the high degree of purity for 
each lipoprotein. 

4. Radioisotope Labeling of Deoxyoligonucleotides

Complementary single stranded oligonucleotides were mixed (10 µg each) and
incubated at 85°C for 5 min in 10 mM Tris HCl (pH 7.4). Immediately following incubation,
the samples were cooled down slowly to room temperature to obtain double stranded
oligonucleotides. The double stranded oligonucleotides were then digested with BamHI and
EcoRI for 1 h at 37°C in 50 mM Tris HCl (pH 8.0), 100 mM NaCl, and 10 mM MgCl2.
Digested double stranded oligonucleotides were purified using a Qiaquick nucleotide removal
kit from Qiagen Inc. according to the manufacturer's protocol. The 5' protruding ends of the
purified oligonucleotides were then labeled with 32P-α-dATP using a Prime-It II labeling kit
containing Exo (-) Klenow enzyme from Stratagene Inc. according to the manufacturer's
protocol. The specific activity of all oligonucleotides was determined by scintillation counting.

The DNA-binding studies were performed as described above except that the agarose 
gel was not stained with ethidium bromide. Instead, following electrophoresis, the agarose gel 
was dried under vacuum and exposed to X-ray film for 4 h at room temperature prior to protein
staining in a solution containing 50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie
Brilliant Blue R-250 (Bio-Rad Labs). Oligonucleotides and human LDL were present at
400,000 cpm and 40 µg protein per lane, respectively.

5. Sonication of plasma lipoproteins

Solutions of plasma lipoproteins in phosphate-buffered saline containing 10 mM MgCl2
were kept on ice and sonicated for various time periods ranging from 0 to 6 minutes in a
Sonifier Model 350 sonicator (Branson Sonic Power Co.) at the following settings: duty cycle,
30%, pulsed; output control, level 2. Immediately following sonication, genomic DNA was
added to the sonicated solutions, and the DNA-binding assay (see above) was started.

6. RT-PCR™ of Lipoprotein-bound RNA

Human liver RNA, complexed to human LDL or to human VLDL as described above,
was subjected to agarose gel electrophoresis and extracted from the gel by solubilizing the gel
for 20 min at 50°C in 3 times the gel volume of QX-1 buffer (Qiagen) and by twice adding an
equivalent volume of phenol/chloroform (pH 4.0). RNA was precipitated by adding an
equivalent volume of 100% isopropanol and freezing the mixture overnight at -80°C. RNA
pellets were dissolved in 50 µl of DEPC-treated water. For each reaction, the dissolved RNA (3
µl) was reverse transcribed into single-stranded DNA by adding 100 mM KCl, 10 mM
Tris-HCl (pH 8.3), 5 mM MgCl2, 2.5 µM primer (oligo d(T) or random hexamers), 1 U/µl RNase
inhibitor, 1 mM each of dATP, dCTP, dTTP, and dGTP, and 2.5 U/µl of MuLV reverse
transcriptase in a total reaction volume of 20 µl. The single-stranded DNA samples were then
amplified in 100 mM KCl, 10 mM Tris-HCl (pH 8.3), 2 mM MgCl2, 0.15 µM each of the
forward and reverse ISRE primers (see Table 2), 1 mM each of dATP, dCTP, dTTP, and dGTP,
and 2.5 U/100 µl of AmpliTaq DNA polymerase in a total reaction volume of 100 µl. DNA
amplification was carried out in a thermocycler in 30 consecutive cycles of denaturing at 95°C
for 60 sec, reannealing at 55°C for 60 sec, and primer extension at 72°C for 120 sec, followed by
a final extension at 72°C for 7 min. For each PCR reaction, 10 µl of the reaction mixture was analyzed
by electrophoresis on a 1% agarose gel in TBE buffer (45 mM Tris-borate and 1 mM EDTA,
pH 8.0) at a constant 100 V for 1 h. The PCR products were visualized by
staining the gel with ethidium bromide.
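
The amplification program described above can be written down as a small data structure, which makes the programmed hold time easy to check. This is an illustrative sketch only; the names `STEPS`, `FINAL_EXTENSION` and `programmed_hold_seconds` are hypothetical, and ramp times between temperatures (which depend on the instrument) are not included.

```python
# Cycling parameters as stated in the text: (step name, temperature in °C, seconds)
CYCLES = 30
STEPS = [
    ("denature", 95, 60),
    ("anneal", 55, 60),
    ("extend", 72, 120),
]
FINAL_EXTENSION = ("final extension", 72, 7 * 60)


def programmed_hold_seconds(cycles=CYCLES, steps=STEPS, final=FINAL_EXTENSION):
    """Total programmed hold time in seconds, excluding ramping."""
    per_cycle = sum(seconds for _name, _temp, seconds in steps)
    return cycles * per_cycle + final[2]


if __name__ == "__main__":
    total = programmed_hold_seconds()
    print(f"{total} s ({total / 60:.0f} min) of programmed holds")
```

With these values each cycle holds for 240 s, so the program holds for 7,620 s (127 min) in total before ramping is counted.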

7. DNA sequencing 

DNA fragments obtained from the RT-PCR reactions were separated by electrophoresis 
on a 1% agarose gel and extracted from the gel by using a Qiagen gel extraction kit according to 
the manufacturer's protocol. DNA samples were analyzed on an Applied Biosystems Inc.
model 373 automated DNA sequencing apparatus after dye-terminator thermal cycle sequencing.

8. Cell culture and transfection assays. 

Human skin fibroblasts were cultured in complete growth medium consisting of 
Dulbecco's modified Eagle's medium that was supplemented with 10% fetal bovine serum and 100
µg/ml each of streptomycin and penicillin, at 37°C in an atmosphere of 5% CO2 in a humidified
incubator. Twenty-four hours before cell transfection, during exponential growth, the cultured
cells were harvested by trypsinization, replated at a cell density of 1 x 10* cells in 35-mm
culture dishes containing a glass coverslip, and cultured in complete growth medium. All
transfection experiments were performed in triplicate as described.

9. LipoFectin assay.

The pEGFP-N1 plasmid and LipoFectin were mixed together at a ratio of 1:4 (wt/wt) in
200 µl of serum-free medium and incubated for 15 min at room temperature. When the cells
reached 40 to 60% confluence, they were transfected with a mixture of 5 µg of DNA and 20 µg
of LipoFectin per 35-mm culture dish, each mixture having been diluted in 1 ml of serum-free
medium. Transfection was performed for 16 h at 37°C. Once transfection was achieved, the
liposomes were removed from the culture dish by gentle washing, and the cells were maintained
in 2 ml of growth medium per 35-mm culture dish for 24 h at 37°C. Expression of GFP in the
cells was determined by fluorescence microscopy.

10. LDL assay. 

The pEGFP-N1 plasmid and LDL were mixed together at a ratio of 1:10 (wt/wt) in 100
µl of serum-free medium containing 10 mM MgCl2 and incubated for 15 min at 37°C. When
the cells were 40 to 60% confluent, they were transfected for 16 h at 37°C with a mixture of 5
µg of DNA and 50 µg of LDL per 35-mm culture dish, each mixture having been diluted in 1 ml of
serum-free medium. Once transfection was achieved, the LDLs were removed by gentle
washing, and the cells were maintained in 2 ml of growth medium per 35-mm culture dish for
24 h at 37°C. At 24 h after transfection, the cells were washed with PBS and fixed in 2 ml of PBS
containing 4% paraformaldehyde per 35-mm culture dish for 30 min. The coverslips were then removed
from the culture dishes, washed with PBS, placed in an inverted orientation on glass slides, and
examined by fluorescent microscopy to detect GFP.
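
The carrier amounts used in the LipoFectin and LDL assays follow directly from the stated weight ratios. The sketch below only illustrates that arithmetic; the names `CARRIER_PER_DNA` and `carrier_mass_ug` are hypothetical.

```python
# Weight ratios (carrier mass per unit DNA mass) as stated in the two assays.
CARRIER_PER_DNA = {
    "LipoFectin": 4,   # DNA:LipoFectin = 1:4 (wt/wt)
    "LDL": 10,         # DNA:LDL = 1:10 (wt/wt)
}


def carrier_mass_ug(dna_ug, carrier):
    """Mass of carrier (µg) needed for a given DNA mass (µg)."""
    if dna_ug < 0:
        raise ValueError("DNA mass cannot be negative")
    return dna_ug * CARRIER_PER_DNA[carrier]


if __name__ == "__main__":
    for name in CARRIER_PER_DNA:
        print(name, carrier_mass_ug(5, name), "µg per 5 µg DNA")
```

For the 5 µg of plasmid used per 35-mm dish, this gives the 20 µg of LipoFectin and 50 µg of LDL reported in the protocols.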

11. In vivo reporter gene expression. 

Two-month-old female Sprague-Dawley rats were anesthetized with a combination 
anesthetic (42.8 mg/ml ketamine, 8.6 mg/ml xylazine, and 1.4 mg/ml acepromazine), and a
prebound complex of purified rat LDL and linearized pEGFP-N1 plasmid DNA was injected
intravenously (into the femoral vein), subcutaneously, intraperitoneally, and into the
pharyngeal, nasal, and rectal mucosae (100 µg of LDL protein and 5 µg of DNA in 100 µl of
PBS containing 10 mM MgCl2 per site). Control animals were injected with linearized
pEGFP-N1 plasmid DNA in which the HCMV IE promoter sequence was interrupted only by digestion
with restriction enzymes (5 µg of DNA in 100 µl of PBS containing 10 mM MgCl2 per site).
After 2, 5, or 7 days, all the treated and control rats were sacrificed, their blood was collected by
means of cardiac puncture, and the tissues were excised and immobilized in OCT by means of
snap freezing over liquid nitrogen or by immediate freezing in liquid nitrogen. The
immobilized tissue samples were sectioned on a cryomicrotome, and the sections (5-8 µm
thick) were fixed for 30 min in 4% paraformaldehyde and analyzed for expression of EGFP
(green fluorescent protein) by fluorescent microscopy.

12. Fluorescent microscopy. 

Microscopy was performed by using an Olympus Model BH-2 fluorescent microscope 
(Olympus, USA) equipped with a digital camera (Hamamatsu, Model C5810) and a color
printer (Image Master, Toshiba). The filter set used was a standard fluorescein isothiocyanate
(FITC) set (Chroma Technology, Brattleboro, VT, USA). The maximum excitation and
emission wavelengths for this filter set were 485 nm (range 460-510 nm) and 540 nm (range
515-565 nm), respectively. Transfection efficiency was determined by calculating the average
percentage of transduced cells of five different fields per 35-mm culture dish.
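
The transfection-efficiency calculation described above amounts to averaging per-field percentages. The following is a minimal sketch under that reading; the function name `transfection_efficiency` and the cell counts in the example are hypothetical and are not data from the study.

```python
from statistics import mean


def transfection_efficiency(fields):
    """Average percentage of fluorescent cells over five microscope fields.

    `fields` is a sequence of (fluorescent_cells, total_cells) pairs,
    one pair per field of the same 35-mm dish.
    """
    if len(fields) != 5:
        raise ValueError("expected counts from exactly five fields")
    percentages = []
    for fluorescent, total in fields:
        if total <= 0 or not 0 <= fluorescent <= total:
            raise ValueError("invalid cell counts")
        percentages.append(100.0 * fluorescent / total)
    return mean(percentages)


if __name__ == "__main__":
    # Hypothetical counts for one dish
    counts = [(10, 100), (20, 100), (30, 100), (40, 100), (50, 100)]
    print(f"{transfection_efficiency(counts):.1f}% of cells transduced")
```

Averaging the per-field percentages, rather than pooling all counts, weights each field equally regardless of how many cells it contains.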

13. Detection of GFP. 

Excised rat tissues were homogenized in 150 µl of PBS in a Dounce homogenizer placed 
on ice. The homogenized tissues were centrifuged for 3 min at 13,000 x g, and 50-µl aliquots 
were withdrawn and used in an ELISA to detect GFP. First, serial dilutions (range 1:10 
to 1:1,000) of all samples were made in PBS. ELISA plates (96 wells) were coated with the 
samples (three wells/sample) by incubating the plates at room temperature for 3 h. The plated 
samples were then washed three times with 200 µl of 1 x PBS containing 0.1% Tween 20 
(PBST) and blocked with 200 µl of PBST containing 1% bovine serum albumin (BSA) for 2 h 
at room temperature while shaking gently. The washing procedure was repeated with 200 µl of 
PBST containing 0.1% BSA, and the plated samples were incubated with a 1:2,000 dilution of a 
recombinant GFP polyclonal antibody (IgG fraction, Clontech Inc., Palo Alto, CA) in PBST 
containing 0.1% BSA (50 µl of diluted mixture per well) for 18 h at 4°C while shaking gently. 
The plated samples were washed and incubated with a 1:5,000 dilution of HRP-conjugated goat 
anti-rabbit antibody (IgG fraction, Cappel, Durham, NC) in PBST containing 0.1% BSA for 1 h 
at room temperature while shaking gently. The washing procedure was repeated and was 
followed by a final wash with 1 x PBS. GFP was detected after a 30-min incubation at room 
temperature in PBS containing o-phenylenediamine as a chromogenic substrate. 
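
The plate set-up described above (serial dilutions, three wells per sample, 96-well plates) can be sketched as a simple well assignment. This is an illustrative sketch only: the tenfold dilution steps, the reading of "three wells/sample" as three wells per sample dilution, the row-by-row fill order, and the sample names are all assumptions rather than details given in the protocol.

```python
from itertools import product


def plate_layout(samples, dilution_factors=(10, 100, 1000), replicates=3):
    """Assign each (sample, dilution) pair to consecutive wells of a 96-well plate."""
    wells = [f"{row}{col}" for row in "ABCDEFGH" for col in range(1, 13)]
    needed = len(samples) * len(dilution_factors) * replicates
    if needed > len(wells):
        raise ValueError(f"{needed} wells needed, only {len(wells)} available")
    free_wells = iter(wells)
    layout = {}
    for sample, factor in product(samples, dilution_factors):
        # Replicate wells for one sample dilution sit side by side.
        layout[(sample, factor)] = [next(free_wells) for _ in range(replicates)]
    return layout


# Hypothetical tissue homogenates.
layout = plate_layout(["liver", "heart"])
for (sample, factor), assigned in layout.items():
    print(f"{sample} 1:{factor} -> {', '.join(assigned)}")
```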

EXAMPLE 2 

BINDING OF HUMAN GENOMIC DNA TO HUMAN LDL 

The binding of human genomic DNA (hg DNA) to human LDL has also been 
demonstrated. Each lane of the agarose gel contained hg DNA cut with AluI or HindIII. In 
addition, human VLDL and mouse LDL were run alongside the hg DNA. 

Plasma lipoproteins were isolated from human or mouse blood according to the protocol 
described above. DNA-binding studies were performed using human genomic DNA digested 
with either AluI or HindIII. Following electrophoresis, the gel was stained for DNA with 
ethidium bromide prior to protein staining in a solution containing 50% v/v ethanol, 10% v/v 
acetic acid, and 0.25% Coomassie Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). 

Each lane contained 5 µg human genomic DNA (hg DNA) cut with AluI or HindIII. In 
addition, human VLDL (10 µg protein per lane), human LDL (35 µg protein per lane) and 
mouse LDL (10 µg protein per lane) were also analysed. 

Gel-shift electrophoresis showed specific binding of digested human DNA fragments to 
human LDL. DNA fragments obtained by AluI or HindIII digestion of human genomic DNA 
migrated toward the anode with much slower mobility when preincubated with human LDL 
but not when incubated with human VLDL, human HDL, or mouse LDL. The complexed 
DNA/lipoprotein bands were first visualized using DNA-binding ethidium bromide and 
photographed using transmitted ultraviolet light for activation of the fluorescent dye. 
Lipoproteins were next visualized with CBB and photographed using transmitted visible 
light. The results shown in this figure indicate that aliquots of AluI- and 
HindIII-digested human genomic DNA fragments comigrate with human LDL and are 
therefore bound to human LDL. 

While AluI and HindIII were used to digest genomic DNA in the studies shown here, 
the inventors of the instant invention have also used BamHI and PvuI for genomic DNA 
digestion. It is understood by those of skill in the art that there are many known restriction 
enzymes, all of which are capable of genomic DNA digestion resulting in DNA that can be 
successfully bound to LDL. DNA digested with AluI yields DNA of very small size (200-700 
nucleotides), which allows isolation of the slower migrating digested DNA bound to LDL 
from the unbound digested DNA using agarose gel electrophoresis. Digestion of genomic 
DNA with HindIII yields genomic DNA of greater average size (1000-7000 nucleotides), 
which reaches the upper size limit for separation by agarose gel electrophoresis (the 
technique used here); however, there are other known DNA separation techniques which 
would work similarly to accomplish the goal of separating free DNA from DNA bound to LDL. 
The choice of which separation technique to use is dependent only on the size of the DNA 
fragments resulting after digestion. In principle, undigested genomic DNA would also work. 
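
The relationship between enzyme choice and fragment size can be explored with a small in silico digest. The sketch below is illustrative only: it treats the DNA as a linear top strand, uses the standard recognition sites and cut positions of the four enzymes named above (AluI AG^CT, HindIII A^AGCTT, BamHI G^GATCC, PvuI CGAT^CG), and the function name and test sequences are hypothetical.

```python
# Recognition site and top-strand cut offset (bases from the start of the site).
SITES = {
    "AluI": ("AGCT", 2),      # AG^CT, blunt ends
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "PvuI": ("CGATCG", 4),     # CGAT^CG
}


def digest(sequence, enzyme):
    """Return the top-strand fragments of a linear sequence cut by one enzyme."""
    site, offset = SITES[enzyme]
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical short sequences, not genomic DNA.
print([len(f) for f in digest("AAAAGCTTTT", "AluI")])
print([len(f) for f in digest("CCAAGCTTGG", "HindIII")])
```

Because a four-base site such as AGCT occurs far more often than a six-base site such as AAGCTT, AluI is expected to give smaller average fragments than HindIII, consistent with the size ranges quoted above.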

EXAMPLE 3 
BINDING OF PLASMID DNA TO HUMAN LDL 

Plasma LDL were isolated from human blood according to the protocol previously 
described in Example 1. DNA-binding studies were performed using plasmid DNA 
(pBluescript II KS, Stratagene Inc.) digested with PvuI. Following electrophoresis, the 
agarose gel was stained for DNA with ethidium bromide prior to protein staining in a solution 
containing 50% v/v ethanol, 10% W/V acetic acid, and 0.25% Coomassie Brilliant Blue R-250 
(CBB R-250, Bio-Rad Labs). The binding of plasmid DNA to human LDL was shown in a gel 
that contained 0.5 µg molecular size DNA marker (Lane 1); 2 µg pKS DNA cut with PvuI 
(Lanes 2-4); 35 µg human LDL protein (Lane 3); and 70 µg human LDL protein (Lane 4). 

Results of the electrophoretogram illustrated specific binding of PvuI-digested plasmid 
DNA (pBluescript II KS, Stratagene Inc.) and human LDL. Increased amounts of human LDL 
also caused an increase of DNA shifted to the LDL location and a decrease of the free 
PvuI-digested DNA band. Co-migration of the PvuI-digested DNA and human LDL is proof of a 
physical complex composed of LDL and DNA. 

EXAMPLE 4 
BINDING OF CMV PROMOTER-REGULATORY 
SEQUENCES TO HUMAN LDL 

Plasma lipoproteins were isolated from human or mouse blood according to the protocol 
previously described in Example 1. DNA-binding studies were performed using plasmid DNA 
(either pBluescript II KS or pBKCMV, Stratagene Inc.) digested with BamHI. Following 
electrophoresis, the agarose gel was stained for DNA with ethidium bromide prior to protein 
staining in a solution containing 50% v/v ethanol, 10% v/v acetic acid, and 0.25% Coomassie 
Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). Loading quantities per lane were as follows: 

plasmid DNA:   1 µg DNA/lane 
human VLDL:    35 µg protein/lane 
human LDL:     35 µg protein/lane 
mouse VLDL:    8 µg protein/lane 
mouse LDL:     35 µg protein/lane 

This study used BamHI-cut pKS, BamHI-cut pBKCMV, human VLDL, human LDL, mouse 
VLDL and mouse LDL. 



Human LDL complexed with the BamHI-linearized plasmids pBluescript II KS or pBKCMV 
was compared. The inventors' results illustrated that specific binding of BamHI-linearized 
plasmid DNA and human LDL occurs, but these BamHI-linearized plasmids do not complex 
with either human VLDL, mouse VLDL or mouse LDL under the conditions previously 
described in the DNA-binding protocol (Example 2). Further, enhanced binding of human LDL 
and the BamHI-linearized plasmid pBKCMV DNA, which contains the cytomegalovirus 
promoter region SEQ ID NO:225 (Table 2), was observed as compared to the BamHI-linearized 
plasmid pBluescript II KS DNA, which does not contain the cytomegalovirus promoter region 
(lane 3). Because binding of DNA by LDL is enhanced in the presence of the CMV promoter, 
it is possible that LDL binds specifically to the CMV promoter sequence (SEQ ID NO:225, see 
Table 2). 



Aliquots containing approximately 8 µg mouse VLDL protein were used in each 
DNA-binding assay mixture resolved in lanes 4 and 9 as compared to 35 µg of total protein of 
all other lipoproteins (lanes 2, 3, 5, 7, 8, and 10). Due to the low physiological concentration 
of VLDL in mouse plasma and the limited loading capacity of the gel, it was not possible to 
load 35 µg of mouse VLDL protein per lane. Therefore, this study does not allow for a 
quantitative comparison of the plasmid DNA-binding capacity of mouse VLDL vs. human 
VLDL, human LDL, and mouse LDL. 



TABLE 2 

Nucleotide Sequence of the Promoter Region (1300-1900) of the Human Cytomegalovirus 

SEQ ID NO:225 

GGATCTGACG GTTCACTAAA CCAGCTCTGC TTATATAGAC CTCCCACCGT 
ACACGCCTAC CGCCCATTTG CGTCAATGGG GCGGAGTTGT TACGACATTT 
TGGAAAGTCC CGTTGATTTT GGTGCCAAAA CAAACTCCAT TGACGTCAAT 
GGGGTGGAGA CTTGGAAATC CCCGTGAGTC AAACCGCTAT CCACGCCCAT 
TGATGTACTG CCAAAACCGC ATCACCATGG TAATAGCGAT GACTAATACG 
TAGATGTACT GCCAAGTAGG AAAGTCCCAT AAGGTCATGT ACTGGGCATA 
ATGCCAGGCG GGCCATTTAC CGTCATTGAC GTCAATAGGG GGCGTACTTG 
GCATATGATA CACTTGATGT ACTGCCAAGT GGGCAGTTTA CCGTAAATAC 
TCCACCCATT GACGTCAATG GAAAGTCCCT ATTGGCGTTA CTATGGGAAC 
ATACGTCATT ATTGACGTCA ATGGGCGGGG GTCGTTGGGC GGTCAGCCAG 
GCGGGCCATT TACCGTAAGT TATGTAACGC GGAACTCCAT ATATGGGCTA 
TGAACTAATG ACCCCGTAAT TGATTACTAT TAATAACTA 

Major repeat regions are indicated in bold and underlined. 
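
Repeated stretches in a sequence such as SEQ ID NO:225 can be located with a simple k-mer scan. The sketch below is an illustration under stated assumptions: the function name and the choice of k are hypothetical, the fragment is two consecutive rows of Table 2 copied as printed, and the scan makes no claim about which repeats the original table highlighted.

```python
from collections import defaultdict


def repeated_kmers(sequence, k):
    """Map each k-mer that occurs more than once to its 0-based start positions."""
    # Drop the spaces used for 10-base grouping in the printed table.
    sequence = "".join(sequence.split()).upper()
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Two consecutive rows of Table 2 (SEQ ID NO:225), copied as printed.
FRAGMENT = (
    "TCCACCCATT GACGTCAATG GAAAGTCCCT ATTGGCGTTA CTATGGGAAC "
    "ATACGTCATT ATTGACGTCA ATGGGCGGGG GTCGTTGGGC GGTCAGCCAG"
)

for kmer, starts in repeated_kmers(FRAGMENT, 12).items():
    print(kmer, starts)
```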
EXAMPLE 5 
BINDING OF SRE, E/C, FAS, AND ISRE 
DEOXYNUCLEOTIDE SEQUENCES TO HUMAN LDL 

Plasma lipoproteins were isolated from human or mouse blood according to the protocol 
previously described in Example 1. DNA-binding studies were performed using the synthetic 
oligonucleotides SRE, E/C, and FAS (see Table 3 for nucleotide sequences). 

TABLE 3 

Deoxyribonucleic Acid Sequences of Synthetic Oligonucleotides 
used in Binding Studies with LDL 

SEQ ID NO   Oligo Name   Sequence (5'-3') 

226         SRE-2A       GATCCAAATCACCCACTGCAACTCCTCCCCCTGCG 
227         E/C-1A       GATCCATCCAATTGGGCTVATCAGGAG 
228         FAS-1A       GATCCGGTCTCCAATTGG 
229         ISRE-1A      GATCCTCGGGAAAGGGAAACCGAAACTGAAGCCG 



DNA-binding studies were performed according to the previously described DNA-binding 
protocol (Example 2). Following electrophoresis, the agarose gel was stained for DNA 
with ethidium bromide prior to protein staining in a solution containing 50% v/v ethanol, 10% 
v/v acetic acid, and 0.25% Coomassie Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). 
Oligonucleotides were present at 1 µg DNA per lane. Lanes containing human LDL contained 
35 µg protein per lane and lanes containing mouse LDL contained 15 µg protein per lane. 

The data showed complexes of the synthetic, double-stranded oligonucleotide 
fragments with human LDL. The results strongly support that human LDL binds to these DNA 
sequences in a highly specific fashion. The synthetic oligonucleotides SRE-2A, E/C-1A, 
FAS-1A, and ISRE-1A (Table 3, SEQ ID NO:226, SEQ ID NO:227, SEQ ID NO:228, and SEQ ID 
NO:229, respectively) bind to human LDL but do not bind to mouse LDL. DNA binding to 
human LDL is illustrated by the appearance of a fraction of slower mobility DNA that 
comigrates with human LDL. 

In another embodiment of this same study, binding was determined using radioisotope 
labeling of the deoxynucleotide sequences as described in Example 1. The results from these 
DNA-binding studies show that human LDL binds to the synthetic oligonucleotides SRE-2A, 
E/C-1A, FAS-1A, and ISRE-1A (Table 3, SEQ ID NO:226; SEQ ID NO:227; SEQ ID NO:228; 
SEQ ID NO:229) in a highly specific fashion. DNA binding to human LDL is illustrated by the 
appearance of a fraction of slower mobility DNA that comigrates with human LDL. The 
binding affinity of the different synthetic oligonucleotides for human LDL can be determined 
by kinetic binding studies using quantitative autoradiography well known to those of skill in the 
art. 

EXAMPLE 6 

BINDING OF VARIOUS NUCLEOTIDE SEQUENCES TO 
THE LDL ISOLATED FROM VARIOUS SPECIES 

Plasma lipoproteins were isolated from human, mouse, rat, or baboon blood according 
to the protocol previously described in Example 1. DNA-binding studies were performed 
according to the previously described DNA-binding protocol using the synthetic 
oligonucleotides SRE, E/C, and FAS (see Table 3 for nucleotide sequences), genomic DNA, or 
plasmid DNA containing the CMV promoter. A summary of the binding studies of the instant 
invention is given in Tables 4A and 4B, below. Table 4A illustrates the binding of 
human, mouse, rat and baboon LDL to various forms and sources of DNA, and Table 4B 
lists the DNA/LDL complexes made thus far. 

TABLE 4A 

Binding of Human, Mouse, Rat and Baboon LDL to Various Forms of DNA 

DNA       human LDL   mouse LDL   rat LDL   baboon LDL 

hgDNA     YES         NO          YES       YES 
mgDNA     N.D.        N.D.        YES       N.D. 
rgDNA     N.D.        N.D.        YES       N.D. 
bgDNA     N.D.        N.D.        N.D.      YES 
CMV       YES         NO          YES       YES 
SRE       YES         NO          N.D.      NO 
E/C       YES         NO          N.D.      NO 
FAS       YES         NO          N.D.      NO 

hg = human genomic DNA digested with either AluI or HindIII, mg = mouse genomic 
DNA digested with either AluI or HindIII, rg = rat genomic DNA digested with either AluI or 
HindIII, and bg = baboon genomic DNA digested with either AluI or HindIII. 

YES = binding, NO = no binding, N.D. = binding not determined 
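
For readers who want to query the outcomes in Table 4A programmatically, the table can be encoded as a small lookup structure. This is only a convenience sketch: the names `TABLE_4A`, `binding`, and `binders` are hypothetical, and "not determined" is represented as `None` by assumption.

```python
# Table 4A outcomes: True = binding, False = no binding, None = not determined.
SPECIES = ("human", "mouse", "rat", "baboon")

TABLE_4A = {
    "hgDNA": (True, False, True, True),
    "mgDNA": (None, None, True, None),
    "rgDNA": (None, None, True, None),
    "bgDNA": (None, None, None, True),
    "CMV": (True, False, True, True),
    "SRE": (True, False, None, False),
    "E/C": (True, False, None, False),
    "FAS": (True, False, None, False),
}


def binding(dna, species):
    """Return the recorded outcome for one DNA form and one LDL species."""
    return TABLE_4A[dna][SPECIES.index(species)]


def binders(dna):
    """List the LDL species for which binding to this DNA form was observed."""
    return [s for s, result in zip(SPECIES, TABLE_4A[dna]) if result is True]


print(binders("hgDNA"))
print(binding("SRE", "rat"))
```

Keeping "not determined" distinct from "no binding" matters here, since most of the mouse, rat, and baboon genomic DNA combinations were simply not tested.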
TABLE 4B 

Specific LDL/DNA Complexes That Have Been Made 

DNA                              DNA Digested With   LDL 

human genomic                    AluI                human 
human genomic                    HindIII             human 
human genomic                    BamHI               human 
human genomic                    PvuI                human 
human genomic                    AluI                rat 
human genomic                    HindIII             rat 
human genomic                    BamHI               rat 
human genomic                    PvuI                rat 
human genomic                    AluI                baboon 
human genomic                    HindIII             baboon 
human genomic                    BamHI               baboon 
human genomic                    PvuI                baboon 
mouse genomic                    AluI                rat 
mouse genomic                    HindIII             rat 
rat genomic                      AluI                rat 
rat genomic                      HindIII             rat 
baboon genomic                   AluI                baboon 
baboon genomic                   HindIII             baboon 
pBSKS                            PvuI                human 
pBSKS                            BamHI               human 
pBKCMV                           BamHI               human 
pBKCMV                           BamHI               rat 
pBKCMV                           BamHI               baboon 
SRE-2A oligo (SEQ ID NO:226)     none                human 
E/C-1A oligo (SEQ ID NO:227)     none                human 
FAS-1A oligo (SEQ ID NO:228)     none                human 
ISRE-1A oligo (SEQ ID NO:229)    none                human 

EXAMPLE 7 

DETECTION OF LDL-BOUND DNA IN HUMAN BLOOD 

Plasma lipoproteins are isolated from human blood according to the protocol previously 
described in Example 1. 6X sample loading buffer (30% glycerol, 0.25% xylene cyanol FF, 
0.25% bromophenol blue) is added to the samples in a 1:5 v/v ratio. Samples are underloaded 
into 30 µl wells at the cathode edge of a 0.8% agarose gel in Tris-acetate buffer (pH 7.85), and 
electrophoresis is carried out at a constant 100 V until the negatively charged tracking 
dye migrates at least 50% of the distance from the loading well to the anodic edge of the gel. 
Following electrophoresis, the gel is stained for DNA with ethidium bromide prior to protein 
staining in a solution containing 50% v/v ethanol, 10% W/W acetic acid, and 0.25% Coomassie 
Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). If no DNA is detected by ethidium bromide 
staining, the agarose gel is subjected to Southern blot analysis using a labeled DNA probe. 
The DNA is labeled with a radioactive isotope (e.g., 32P), a non-radioactive tag (DIG) or with 
any other standard DNA-labeling method known to one of skill in the art. Randomly 
synthesized, short oligonucleotides are used as the probe to detect, in a general fashion, 
whether or not DNA is bound to the isolated LDL. Controls include lanes containing known 
quantities of DNA, lanes containing purified LDL digested with DNase I, and LDL bound to 
DNA made by mixing purified LDL and DNA according to the method described in Example 2. 
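
The loading-buffer step above amounts to a simple dilution calculation. As a minimal sketch, assuming the 1:5 v/v ratio means one part 6X buffer to five parts sample (which brings the buffer to 1X in the mixture), the volume to add can be computed as follows; the function name and the 25 µl sample volume are hypothetical.

```python
def loading_buffer_volume(sample_volume_ul, stock_strength=6.0, final_strength=1.0):
    """Volume (µl) of concentrated loading buffer needed to reach final_strength."""
    if stock_strength <= final_strength:
        raise ValueError("stock must be more concentrated than the final mixture")
    # Solve stock * v / (v + sample) = final for v.
    return sample_volume_ul * final_strength / (stock_strength - final_strength)


sample_ul = 25.0  # hypothetical sample volume
buffer_ul = loading_buffer_volume(sample_ul)
print(f"add {buffer_ul:.1f} µl of 6X buffer; total load {sample_ul + buffer_ul:.1f} µl")
```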

LDL isolated from humans with cancer and subjected to the above protocol will have 
detectable DNA bound to the LDL in quantities greater than the amount of DNA bound to LDL 
isolated from humans without cancer. 

EXAMPLE 8 

DETECTION OF SPECIFIC TYPES OF CANCERS WITH 
SEQUENCE SPECIFIC DNA PROBES 

Not only is it possible to identify the presence or absence of cancer in a living body 
using the invention technique (as described in Example 7 above), it is also possible to identify 
specific cancer types by using sequence specific DNA probes. For example, LDL-bound DNA 
isolated from a patient with colon cancer will have a different DNA sequence than the 
LDL-bound DNA isolated from a patient with a different cancer type, for example, breast 
cancer. The different DNA sequences bound to the LDL isolated from different cancer patients 
are determined by first isolating LDL from the blood of a person with an independently 
identified and known cancer type, using the protocol in Example 1. This isolated LDL is then 
digested with various non-specific proteases to remove the LDL while retaining the DNA. This 
DNA is then sequenced using standard sequencing techniques. A list of the DNA sequences, 
along with the type of cancer each is associated with, is made. This list is then used to 
synthesize probes that can differentiate among the various types of cancer. These probes are 
used in screening of a patient with an unknown cancer type, or in the early detection of 
metastatic cancer, or as a general early screening technique for the presence or absence of 
specific cancer types. 

EXAMPLE 9 

METHODS FOR THE DETERMINATION OF METASTATIC GENE TRANSFER VIA 

LIPOPROTEINS AS NATIVE VECTORS 

In order to determine the sequence of polynucleotides bound to endogenous LDL, 
plasma LDL and other apoB-containing lipoproteins are captured using a monoclonal antibody 
to a specific apoB epitope, such as 2G8, which is immobilized on an inert, hydrophilic and 
highly porous polymer microbead. The LDL-DNA complex is then isolated by elution using 
affinity chromatography technology. DNA is further purified from the isolated LDL/DNA 
complex using standard DNA purification methodology such as phenol/chloroform extraction 
followed by ethanol precipitation. Alternatively, purified DNA is isolated from the affinity 
column using elution conditions that disrupt protein/DNA complexes but not protein/protein 
complexes (i.e., antibody/LDL complex). The polynucleotide sequences are determined using 
the SRE, E/C, FAS, and ISRE-1A oligonucleotides (SEQ ID NO:226, SEQ ID NO:227, SEQ 
ID NO:228, and SEQ ID NO:229, respectively) in a standard PCR™ methodology in order to 
amplify polynucleotides with unknown sequences. The amplified PCR™ products (i.e., 
polynucleotides) are then isolated by agarose gel electrophoresis and subjected to DNA 
sequencing techniques well known to the art. 

Alternatively, identification of polynucleotide sequences that are bound to endogenous 
human LDL is via the specific binding of LDL to a plastic matrix such as a 96-well ELISA 
(enzyme-linked immunosorbent assay) plate coated with specific antibodies that bind to human 
LDL. In this embodiment, freshly isolated plasma containing endogenous lipoproteins is used 
to bind to the anti-human LDL antibodies using standard ELISA procedures known to the 
art. The presence and specific sequence of polynucleotides prebound to the endogenous LDL in 
each sample is determined by PCR™ technology. 

Because many varying and different embodiments may be made within the scope of the 
inventive concept herein taught, and because many modifications may be made in the 
embodiments herein detailed in accordance with the descriptive requirement of the law, it is to 
be understood that the details herein are to be interpreted as illustrative and not in a limiting 
sense. 

EXAMPLE 10 

LOW-DENSITY LIPOPROTEIN INTERACTS WITH 

HUMAN CYTOMEGALOVIRUS GENOMIC DNA 

DNA binding experiments with purified plasma lipoprotein fractions and human 
genomic DNA as well as several different plasmids indicate that purified LDL binds to human 
genomic DNA digested with different restriction enzymes (Alu I and Hind III). 

Purified LDL also bound to several different plasmids, but its binding affinity for 
plasmid DNA containing the HCMV IE promoter region was significantly higher. Both LDL 
and VLDL were shown to bind to the HCMV IE promoter region and to the SRE, 
MSRE, ISRE, MISRE, E/C, FAS, and MFAS oligonucleotides. The E/C oligonucleotide was 
used in these DNA binding studies because this oligonucleotide contains both a binding site for 
members of the C/EBP transcription factor family, which are involved in the regulation of 
differentiation-dependent adipocyte gene expression, as well as an overlapping E-box motif 
which is generally recognized by the eukaryotic basic helix-loop-helix (b-HLH) transcriptional 
regulators. LDL clearly have a greater affinity for all of the oligonucleotides tested than do 
VLDL. This is most likely due to interference with protein-DNA interaction caused by either 
the presence of other apolipoproteins on the surface of VLDL or an increased net charge as a 
result of the increased lipid content of VLDL. 

The sequence specificity is illustrated by the fact that both LDL and VLDL show a 
decreased binding affinity for the mutated versions of the ISRE and FAS oligos (MISRE and 
MFAS, respectively). In contrast, LDL showed an increased binding affinity for the mutated 
version of the SRE oligo (MSRE). It is possible that this mutated SRE sequence may be a 
better ligand for the putative DNA binding region of apo B present on LDL. The binding of 
both VLDL and LDL to the E/C oligonucleotide is not surprising, since this oligo contains the 
E-box motif, which is a known binding site for b-HLH proteins, and similar b-HLH regions have 
been identified in apoB present on VLDL and LDL. 
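
The presence of an E-box in an oligonucleotide can be checked with a short pattern search. The sketch below assumes the commonly cited E-box consensus CANNTG (N = any base) and scans the four Table 3 oligonucleotides exactly as printed there; the function name is hypothetical, and a match says nothing about binding strength.

```python
import re

# E-box consensus CANNTG; the lookahead lets overlapping sites be reported.
EBOX = re.compile(r"(?=CA[ACGT]{2}TG)")

# Sequences copied as printed in Table 3 (5'-3').
OLIGOS = {
    "SRE-2A": "GATCCAAATCACCCACTGCAACTCCTCCCCCTGCG",
    "E/C-1A": "GATCCATCCAATTGGGCTVATCAGGAG",
    "FAS-1A": "GATCCGGTCTCCAATTGG",
    "ISRE-1A": "GATCCTCGGGAAAGGGAAACCGAAACTGAAGCCG",
}


def ebox_positions(sequence):
    """Return 0-based start positions of CANNTG motifs on the given strand."""
    return [match.start() for match in EBOX.finditer(sequence.upper())]


for name, sequence in OLIGOS.items():
    print(name, ebox_positions(sequence))
```

Only the printed strand is scanned; because CAATTG is its own reverse complement, the motif found in E/C-1A and FAS-1A is also present on the complementary strand.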

The affinity for the HCMV IE promoter is not immediately obvious since careful 
analysis does not reveal an exact copy of either a SRE, ISRE, FAS, or E/C sequence. However, 
the HCMV IE promoter region contains regulatory elements that are generally recognized by a 
large number of eukaryotic DNA-binding proteins, including a variety of different families of 
transcription factors, and it may therefore be possible that the identified b-HLH regions of apoB 
possess similar DNA binding properties. 

Another possibility is that other, as yet unidentified regions of apoB are involved in the 
binding to the HCMV IE promoter region. The fact that HDL, in contrast to VLDL and LDL, 
does not bind to any of the oligos tested suggests that the DNA binding results from the specific 
interaction with apo B. These data support the hypothesis that apo B contains DNA binding 
domains which show homology with the DNA binding domains of SREBP-1, SREBP-2, ADD-1, 
and ISGF3γ, and that apo B-containing lipoproteins therefore bind to specific nucleotide 
sequences similar to those bound by these known DNA binding proteins. 

Recent reports suggest a possible causal relationship between human cytomegalovirus 
(HCMV) and the development of atherosclerosis in humans. These reports together with data 
presented herein, which show that human LDL binds strongly to HCMV IE promoter 
sequences, led the inventors to investigate whether plasma LDL may play a role in the 
pathogenesis of HCMV induced atherosclerosis. 

To test this hypothesis, the inventors used the polymerase chain reaction (PCR) to look for 
HCMV DNA sequences in the purified plasma LDL fraction of human subjects who tested 
seropositive for HCMV. The results of these studies show that a PCR product of the expected 
size (170 bp) could be detected with both primer sets (MTR2 and IE) in the purified plasma LDL 
fraction of HCMV seropositive subjects. However, this 170 bp DNA fragment could not be 
detected in the plasma samples of these subjects (lanes 6-8). These data suggest that the use of 
purified plasma LDL fractions for detection of CMV nucleic acid sequences by PCR techniques 
is more sensitive than when whole plasma samples are used. Furthermore, the increased yield 
of PCR products from the purified plasma LDL fractions strongly suggests that HCMV DNA is 
predominantly associated with LDL within the plasma pool of HCMV seropositive subjects. 

EXAMPLE 11 
LOW-DENSITY LIPOPROTEIN AS A 
NATURAL GENE TRANSFER VECTOR 

The discovery of the nucleic acid-binding properties of apo B-100 suggested that 
lipoproteins containing apo B-100, as naturally occurring liposomes, may function as gene 
transfer agents. By using highly purified low-density lipoprotein as such an agent, the inventors 
were able to transfect cultured human skin fibroblasts in vitro and to express a green fluorescent 
protein reporter gene in vivo. The gene transfer mediated by low-density lipoprotein was more 
efficient than that mediated by LipoFectin. Low-density lipoprotein also did not exhibit any 
toxicity, immunogenicity, or serum inhibition. 

1. DNA-binding. 

In the Examples above, it was shown that highly purified human LDL binds to nucleic 
acids in a specific fashion. In order to establish whether rat lipoproteins can bind nucleic acids 
in a similar fashion, DNA-binding experiments with different rat lipoprotein fractions were 
performed. A gel shift assay of linearized pBluescript KS and pBKCMV plasmid DNA and 
purified rat VLDL, LDL, and HDL fractions was performed. The data clearly demonstrate that 
the binding of nucleic acids is specific to the purified LDL fraction. 

The binding of LDL to DNA is exhibited by the retarded electrophoretic migration of 
DNA in agarose gel that is caused by the formation of complexes of higher molecular weight. 
In contrast, purified fractions of VLDL and HDL did not bind any of the DNA samples tested. 
The fact that purified HDL did not bind DNA was expected, since endogenous HDL does not 
contain apo B-100. Surprisingly, there was no apparent binding of DNA to apo 
B-100-containing VLDL. It is possible that the DNA-binding assay, which employs ethidium 
bromide staining to detect DNA, lacks sensitivity or that VLDL does not bind to DNA under the 
conditions of the DNA-binding assay. Another explanation could be a difference in the 
conformation of apo B-100 present on LDL as opposed to VLDL because of a difference in the 
lipid composition and protein content of the two lipoprotein fractions. 

2. In vitro cell transfection studies. 
Based on the findings of the DNA-binding assay, transfection studies were performed 
using a prebound complex of LDL and plasmid DNA that contained a reporter gene that 
encodes GFP. 

The data generated illustrated the successful transfection of human skin fibroblasts 
with LDL and pEGFP-N1 plasmid DNA. The transfection process was monitored by 
expression of the GFP-encoding gene, which is driven by the HCMV IE promoter. In addition to 
fluorescent microscopic analysis, expression of GFP was confirmed by a qualitative ELISA 
using a primary antibody against recombinant GFP and an HRP-conjugated secondary antibody 
with o-phenylenediamine as a chromogenic substrate. 

Human skin fibroblasts transfected with LDL exhibited a significantly lower intensity of 
green fluorescence than did cells transfected with LipoFectin, indicating that the level of GFP 
expression was lower in these LDL-transfected cells. When the percentages of positively 
transfected cells were compared, however, transfection with LDL yielded a higher percentage of 
transfected cells than did transfection with LipoFectin (60 to 70% and 20 to 30%, respectively). 
In addition, LipoFectin-mediated transfection resulted in green fluorescence in the cell 
cytoplasm and in the nuclei, whereas LDL-mediated transfection resulted in green fluorescence 
predominantly in the cytoplasm. 

Transfection assays in which LDL concentrations were as high as 250 µg/ml of LDL 
protein produced no detectable effects on the confluence and viability of the cell cultures, 
whereas LipoFectin concentrations of 20 µg/ml resulted in significant loss of cell viability. 
Control cells that were transfected with linearized pEGFP-N1 plasmid DNA only exhibited no 
fluorescence. 

3. In vivo reporter gene expression. 

To evaluate whether LDL could be used as a vehicle for in vivo gene delivery, a 
prebound rat LDL-pEGFP-N1 complex was administered to 2-month-old female 
Sprague-Dawley rats. Cryosections of the liver and heart tissues of the treated animals that had 
been excised 2 days after administration of the LDL-pEGFP-N1 complex showed significant 
levels of green fluorescence indicative of EGFP expression as determined by fluorescent 
microscopy. 

The expression of GFP in the different tissues was confirmed by a qualitative ELISA 
using a primary antibody against recombinant GFP and an HRP-conjugated secondary antibody 
with o-phenylenediamine as a chromogenic substrate. In contrast, only low levels of 
autofluorescence were observed in the cryosectioned tissues obtained from the control animals 
treated solely with linearized pEGFP-N1 DNA. These data demonstrate that purified LDL can 
be used in a prebound complex with DNA as an in vivo gene delivery system. 

* * * 

All of the compositions and/or methods disclosed and claimed herein can be made and 
executed without undue experimentation in light of the present disclosure. While the 
compositions and methods of this invention have been described in terms of preferred 
embodiments, it will be apparent to those of skill in the art that variations may be applied to the 
compositions and/or methods and in the steps or in the sequence of steps of the method 
described herein without departing from the concept, spirit and scope of the invention. More 
specifically, it will be apparent that certain agents which are both chemically and 
physiologically related may be substituted for the agents described herein while the same or 
similar results would be achieved. All such similar substitutes and modifications apparent to 
those skilled in the art are deemed to be within the spirit, scope and concept of the invention as 
defined by the appended claims. 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/874,807 

(B) FILING DATE: 13-JUN-1997 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Glu Glu Glu Met Leu Glu Asn Val Ser Leu Val Cys Pro Lys Asp Ala 
15 10 15 

Thr Arg Phe Lys His Leu Arg Lys Tyr Thr Tyr Asn Tyr Glu Ala Glu 
20 25 30 

Ser Ser Ser Gly Val Pro Gly Thr Ala Asp Ser Arg Ser Ala Thr Arg 
35 40 45 

He Asn Cys Lys Val Glu Leu Glu Val Pro Gin Leu Cys Ser Phe He 

50 55 60 

Leu Lys Thr Ser Gin Cys Thr Leu Lys Glu Val Tyr Gly Phe Asn Pro 
65 70 75 80 

Glu Gly Lys Ala Leu Leu Lys Lys Thr Lys Asn Ser Glu Glu Phe Ala 
85 90 95 

Ala Ala Met Ser Arg Tyr Glu Leu Lys Leu Ala He Pro Glu Gly Lys 
100 105 110 

Gin Val Phe Leu Tyr Pro Glu Lys Asp Glu Pro Thr Tyr He Leu Asn 
115 120 125 

He Lys Arg Gly He He Ser Ala Leu Leu Val Pro Pro Glu Thr Glu 
130 135 140 

Glu Ala Lys Gin Val Leu Phe Leu Asp Thr Val Tyr Gly Asn Cys Ser 
145 150 155 160 

Thr His Phe Thr Val Lys Thr Arg Lys Gly Asn Val Ala Thr Glu He 
165 170 175 

Ser Thr Glu Arg Asp Leu Gly Gin Cys Asp Arg Phe Lys Pro He Arg 
180 185 190 

Thr Gly He Ser Pro Leu Ala Leu He Lys Gly Met Thr Arg Pro Leu 
195 200 205

Ser Thr Leu Ile Ser Ser Ser Gln Ser Cys Gln Tyr Thr Leu Asp Ala
210 215 220

Lys Arg Lys His Val Ala Glu Ala Ile Cys Lys Glu Gln His Leu Phe
225 230 235 240

Leu Pro Phe Ser Tyr Asn Asn Lys Tyr Gly Met Val Ala Gln Val Thr
245 250 255

Gln Thr Leu Lys Leu Glu Asp Thr Pro Lys Ile Asn Ser Arg Phe Phe
260 265 270

Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe Glu Ser Thr Lys Ser
275 280 285

Thr Ser Pro Pro Lys Gln Ala Glu Ala Val Leu Lys Thr Leu Gln Glu
290 295 300

Leu Lys Lys Leu Thr Ile Ser Glu Gln Asn Ile Gln Arg Ala Asn Leu
305 310 315 320

Phe Asn Lys Leu Val Thr Glu Leu Arg Gly Leu Ser Asp Glu Ala Val
325 330 335

Thr Ser Leu Leu Pro Gln Leu Ile Glu Val Ser Ser Pro Ile Thr Leu
340 345 350

Gln Ala Leu Val Gln Cys Gly Gln Pro Gln Cys Ser Thr His Ile Leu
355 360 365

Gln Trp Leu Lys Arg Val His Ala Asn Pro Leu Leu Ile Asp Val Val
370 375 380

Thr Tyr Leu Val Ala Leu Ile Pro Glu Pro Ser Ala Gln Gln Leu Arg
385 390 395 400

Glu Ile Phe Asn Met Ala Arg Asp Gln Arg Ser Arg Ala Thr Leu Tyr
405 410 415

Ala Leu Ser His Ala Val Asn Asn Tyr His Lys Thr Asn Pro Thr Gly
420 425 430

Thr Gln Glu Leu Leu Asp Ile Ala Asn Tyr Leu Met Glu Gln Ile Gln
435 440 445

Asp Asp Cys Thr Gly Asp Glu Asp Tyr Thr Tyr Leu Ile Leu Arg Val
450 455 460

Ile Gly Asn Met Gly Gln Thr Met Glu Gln Leu Thr Pro Glu Leu Lys
465 470 475 480

Ser Ser Ile Leu Lys Cys Val Gln Ser Thr Lys Pro Ser Leu Met Ile
485 490 495

Gln Lys Ala Ala Ile Gln Ala Leu Arg Lys Met Glu Pro Lys Asp Lys
500 505 510

Asp Gln Glu Val Leu Leu Gln Thr Phe Leu Asp Asp Ala Ser Pro Gly
515 520 525

Asp Lys Arg Leu Ala Ala Tyr Leu Met Leu Met Arg Ser Pro Ser Gln
530 535 540

Ala Asp Ile Asn Lys Ile Val Gln Ile Leu Pro Trp Glu Gln Asn Glu
545 550 555 560

Gln Val Lys Asn Phe Val Ala Ser His Ile Ala Asn Ile Leu Asn Ser
565 570 575

Glu Glu Leu Asp Ile Gln Asp Leu Lys Lys Leu Val Lys Glu Ala Leu
580 585 590

Lys Glu Ser Gln Leu Pro Thr Val Met Asp Phe Arg Lys Phe Ser Arg
595 600 605

Asn Tyr Gln Leu Tyr Lys Ser Val Ser Leu Pro Ser Leu Asp Pro Ala
610 615 620

Ser Ala Lys Ile Glu Gly Asn Leu Ile Phe Asp Pro Asn Asn Tyr Leu
625 630 635 640

Pro Lys Glu Ser Met Leu Lys Thr Thr Leu Thr Ala Phe Gly Phe Ala
645 650 655

Ser Ala Asp Leu Ile Glu Ile Gly Leu Glu Gly Lys Gly Phe Glu Pro
660 665 670

Thr Leu Glu Ala Leu Phe Gly Lys Gln Gly Phe Phe Pro Asp Ser Val
675 680 685

Asn Lys Ala Leu Tyr Trp Val Asn Gly Gln Val Pro Asp Gly Val Ser
690 695 700

Lys Val Leu Val Asp His Phe Gly Tyr Thr Lys Asp Asp Lys His Glu
705 710 715 720

Gln Asp Met Val Asn Gly Ile Met Leu Ser Val Glu Lys Leu Ile Lys
725 730 735

Asp Leu Lys Ser Lys Glu Val Pro Glu Ala Arg Ala Tyr Leu Arg Ile
740 745 750

Leu Gly Glu Glu Leu Gly Phe Ala Ser Leu His Asp Leu Gln Leu Leu
755 760 765

Gly Lys Leu Leu Leu Met Gly Ala Arg Thr Leu Gln Gly Ile Pro Gln
770 775 780


SUBSTITUTE SHEET (RULE 26)

WO 98/56938

PCT/US98/11927

Met Ile Gly Glu Val Ile Arg Lys Gly Ser Lys Asn Asp Phe Phe Leu
785 790 795 800

His Tyr Ile Phe Met Glu Asn Ala Phe Glu Leu Pro Thr Gly Ala Gly
805 810 815

Leu Gln Leu Gln Ile Ser Ser Ser Gly Val Ile Ala Pro Gly Ala Lys
820 825 830

Ala Gly Val Lys Leu Glu Val Ala Asn Met Gln Ala Glu Leu Val Ala
835 840 845

Lys Pro Ser Val Ser Val Glu Phe Val Thr Asn Met Gly Ile Ile Ile
850 855 860

Pro Asp Phe Ala Arg Ser Gly Val Gln Met Asn Thr Asn Phe Phe His
865 870 875 880

Glu Ser Gly Leu Glu Ala His Val Ala Leu Lys Ala Gly Lys Leu Lys
885 890 895

Phe Ile Ile Pro Ser Pro Lys Arg Pro Val Lys Leu Leu Ser Gly Gly
900 905 910

Asn Thr Leu His Leu Val Ser Thr Thr Lys Thr Glu Val Ile Pro Pro
915 920 925

Leu Ile Glu Asn Arg Gln Ser Trp Ser Val Cys Lys Gln Val Phe Pro
930 935 940

Gly Leu Asn Tyr Cys Thr Ser Gly Ala Tyr Ser Asn Ala Ser Ser Thr
945 950 955 960

Asp Ser Ala Ser Tyr Tyr Pro Leu Thr Gly Asp Thr Arg Leu Glu Leu
965 970 975

Glu Leu Arg Pro Thr Gly Glu Ile Glu Gln Tyr Ser Val Ser Ala Thr
980 985 990

Tyr Glu Leu Gln Arg Glu Asp Arg Ala Leu Val Asp Thr Leu Lys Phe
995 1000 1005

Val Thr Gln Ala Glu Gly Ala Lys Gln Thr Glu Ala Thr Met Thr Phe
1010 1015 1020



Lys Tyr Asn Arg Gln Ser Met Thr Leu Ser Ser Glu Val Gln Ile Pro
1025 1030 1035 1040

Asp Phe Asp Val Asp Leu Gly Thr Ile Leu Arg Val Asn Asp Glu Ser
1045 1050 1055

Thr Glu Gly Lys Thr Ser Tyr Arg Leu Thr Leu Asp Ile Gln Asn Lys
1060 1065 1070

Lys Ile Thr Glu Val Ala Leu Met Gly His Leu Ser Cys Asp Thr Lys
1075 1080 1085

Glu Glu Arg Lys Ile Lys Gly Val Ile Ser Ile Pro Arg Leu Gln Ala
1090 1095 1100

Glu Ala Arg Ser Glu Ile Leu Ala His Trp Ser Pro Ala Lys Leu Leu
1105 1110 1115 1120

Leu Gln Met Asp Ser Ser Ala Thr Ala Tyr Gly Ser Thr Val Ser Lys
1125 1130 1135

Arg Val Ala Trp His Tyr Asp Glu Glu Lys Ile Glu Phe Glu Trp Asn
1140 1145 1150

Thr Gly Thr Asn Val Asp Thr Lys Lys Met Thr Ser Asn Phe Pro Val
1155 1160 1165

Asp Leu Ser Asp Tyr Pro Lys Ser Leu His Met Tyr Ala Asn Arg Leu
1170 1175 1180

Leu Asp His Arg Val Pro Glu Thr Asp Met Thr Phe Arg His Val Gly
1185 1190 1195 1200

Ser Lys Leu Ile Val Ala Met Ser Ser Trp Leu Gln Lys Ala Ser Gly
1205 1210 1215

Ser Leu Pro Tyr Thr Gln Thr Leu Gln Asp His Leu Asn Ser Leu Lys
1220 1225 1230

Glu Phe Asn Leu Gln Asn Met Gly Leu Pro Asp Phe His Ile Pro Glu
1235 1240 1245

Asn Leu Phe Leu Lys Ser Asp Gly Arg Val Lys Tyr Thr Leu Asn Lys
1250 1255 1260

Asn Ser Leu Lys Ile Glu Ile Pro Leu Pro Phe Gly Gly Lys Ser Ser
1265 1270 1275 1280

Arg Asp Leu Lys Met Leu Glu Thr Val Arg Thr Pro Ala Leu His Phe
1285 1290 1295

Lys Ser Val Gly Phe His Leu Pro Ser Arg Glu Phe Gln Val Pro Thr
1300 1305 1310

Phe Thr Ile Pro Lys Leu Tyr Gln Leu Gln Val Pro Leu Leu Gly Val
1315 1320 1325

Leu Asp Leu Ser Thr Asn Val Tyr Ser Asn Leu Tyr Asn Trp Ser Ala
1330 1335 1340

Ser Tyr Ser Gly Gly Asn Thr Ser Thr Asp His Phe Ser Leu Arg Ala
1345 1350 1355 1360

Arg Tyr His Met Lys Ala Asp Ser Val Val Asp Leu Leu Ser Tyr Asn
1365 1370 1375

Val Gln Gly Ser Gly Glu Thr Thr Tyr Asp His Lys Asn Thr Phe Thr
1380 1385 1390

Leu Ser Cys Asp Gly Ser Leu Arg His Lys Phe Leu Asp Ser Asn Ile
1395 1400 1405

Lys Phe Ser His Val Glu Lys Leu Gly Asn Asn Pro Val Ser Lys Gly
1410 1415 1420

Leu Leu Ile Phe Asp Ala Ser Ser Ser Trp Gly Pro Gln Met Ser Ala
1425 1430 1435 1440

Ser Val His Leu Asp Ser Lys Lys Lys Gln His Leu Phe Val Lys Glu
1445 1450 1455

Val Lys Ile Asp Gly Gln Phe Arg Val Ser Ser Phe Tyr Ala Lys Gly
1460 1465 1470

Thr Tyr Gly Leu Ser Cys Gln Arg Asp Pro Asn Thr Gly Arg Leu Asn
1475 1480 1485

Gly Glu Ser Asn Leu Arg Phe Asn Ser Ser Tyr Leu Gln Gly Thr Asn
1490 1495 1500

Gln Ile Thr Gly Arg Tyr Glu Asp Gly Thr Leu Ser Leu Thr Ser Thr
1505 1510 1515 1520

Ser Asp Leu Gln Ser Gly Ile Ile Lys Asn Thr Ala Ser Leu Lys Tyr
1525 1530 1535

Glu Asn Tyr Glu Leu Thr Leu Lys Ser Asp Thr Asn Gly Lys Tyr Lys
1540 1545 1550

Asn Phe Ala Thr Ser Asn Lys Met Asp Met Thr Phe Ser Lys Gln Asn
1555 1560 1565

Ala Leu Leu Arg Ser Glu Tyr Gln Ala Asp Tyr Glu Ser Leu Arg Phe
1570 1575 1580

Phe Ser Leu Leu Ser Gly Ser Leu Asn Ser His Gly Leu Glu Leu Asn
1585 1590 1595 1600

Ala Asp Ile Leu Gly Thr Asp Lys Ile Asn Ser Gly Ala His Lys Ala
1605 1610 1615

Thr Leu Arg Ile Gly Gln Asp Gly Ile Ser Thr Ser Ala Thr Thr Asn
1620 1625 1630

Leu Lys Cys Ser Leu Leu Val Leu Glu Asn Glu Leu Asn Ala Glu Leu
1635 1640 1645

Gly Leu Ser Gly Ala Ser Met Lys Leu Thr Thr Asn Gly Arg Phe Arg
1650 1655 1660

Glu His Asn Ala Lys Phe Ser Leu Asp Gly Lys Ala Ala Leu Thr Glu
1665 1670 1675 1680

Leu Ser Leu Gly Ser Ala Tyr Gln Ala Met Ile Leu Gly Val Asp Ser
1685 1690 1695

Lys Asn Ile Phe Asn Phe Lys Val Ser Gln Glu Gly Leu Lys Leu Ser
1700 1705 1710

Asn Asp Met Met Gly Ser Tyr Ala Glu Met Lys Phe Asp His Thr Asn
1715 1720 1725

Ser Leu Asn Ile Ala Gly Leu Ser Leu Asp Phe Ser Ser Lys Leu Asp
1730 1735 1740

Asn Ile Tyr Ser Ser Asp Lys Phe Tyr Lys Gln Thr Val Asn Leu Gln
1745 1750 1755 1760

Leu Gln Pro Tyr Ser Leu Val Thr Thr Leu Asn Ser Asp Leu Lys Tyr
1765 1770 1775

Asn Ala Leu Asp Leu Thr Asn Asn Gly Lys Leu Arg Leu Glu Pro Leu
1780 1785 1790

Lys Leu His Val Ala Gly Asn Leu Lys Gly Ala Tyr Gln Asn Asn Glu
1795 1800 1805

Ile Lys His Ile Tyr Ala Ile Ser Ser Ala Ala Leu Ser Ala Ser Tyr
1810 1815 1820

Lys Ala Asp Thr Val Ala Lys Val Gln Gly Val Glu Phe Ser His Arg
1825 1830 1835 1840

Leu Asn Thr Asp Ile Ala Gly Leu Ala Ser Ala Ile Asp Met Ser Thr
1845 1850 1855

Asn Tyr Asn Ser Asp Ser Leu His Phe Ser Asn Val Phe Arg Ser Val
1860 1865 1870

Met Ala Pro Phe Thr Met Thr Ile Asp Ala His Thr Asn Gly Asn Gly
1875 1880 1885

Lys Leu Ala Leu Trp Gly Glu His Thr Gly Gln Leu Tyr Ser Lys Phe
1890 1895 1900

Leu Leu Lys Ala Glu Pro Leu Ala Phe Thr Phe Ser His Asp Tyr Lys
1905 1910 1915 1920

Gly Ser Thr Ser His His Leu Val Ser Arg Lys Ser Ile Ser Ala Ala
1925 1930 1935

Leu Glu His Lys Val Ser Ala Leu Leu Thr Pro Ala Glu Gln Thr Gly
1940 1945 1950

Thr Trp Lys Leu Lys Thr Gln Phe Asn Asn Asn Glu Tyr Ser Gln Asp
1955 1960 1965

Leu Asp Ala Tyr Asn Thr Lys Asp Lys Ile Gly Val Glu Leu Thr Gly
1970 1975 1980

Arg Thr Leu Ala Asp Leu Thr Leu Leu Asp Ser Pro Ile Lys Val Pro
1985 1990 1995 2000

Leu Leu Leu Ser Glu Pro Ile Asn Ile Ile Asp Ala Leu Glu Met Arg
2005 2010 2015

Asp Ala Val Glu Lys Pro Gln Glu Phe Thr Ile Val Ala Phe Val Lys
2020 2025 2030

Tyr Asp Lys Asn Gln Asp Val His Ser Ile Asn Leu Pro Phe Phe Glu
2035 2040 2045

Thr Leu Gln Glu Tyr Phe Glu Arg Asn Arg Gln Thr Ile Ile Val Val
2050 2055 2060

Val Glu Asn Val Gln Arg Asn Leu Lys His Ile Asn Ile Asp Gln Phe
2065 2070 2075 2080

Val Arg Lys Tyr Arg Ala Ala Leu Gly Lys Leu Pro Gln Gln Ala Asn
2085 2090 2095

Asp Tyr Leu Asn Ser Phe Asn Trp Glu Arg Gln Val Ser His Ala Lys
2100 2105 2110

Glu Lys Leu Thr Ala Leu Thr Lys Lys Tyr Arg Ile Thr Glu Asn Asp
2115 2120 2125

Ile Gln Ile Ala Leu Asp Asp Ala Lys Ile Asn Phe Asn Glu Lys Leu
2130 2135 2140

Ser Gln Leu Gln Thr Tyr Met Ile Gln Phe Asp Gln Tyr Ile Lys Asp
2145 2150 2155 2160

Ser Tyr Asp Leu His Asp Leu Lys Ile Ala Ile Ala Asn Ile Ile Asp
2165 2170 2175

Glu Ile Ile Glu Lys Leu Lys Ser Leu Asp Glu His Tyr His Ile Arg
2180 2185 2190

Val Asn Leu Val Lys Thr Ile His Asp Leu His Leu Phe Ile Glu Asn
2195 2200 2205

Ile Asp Phe Asn Lys Ser Gly Ser Ser Thr Ala Ser Trp Ile Gln Asn
2210 2215 2220

Val Asp Thr Lys Tyr Gln Ile Arg Ile Gln Ile Gln Glu Lys Leu Gln
2225 2230 2235 2240

Gln Leu Lys Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly
2245 2250 2255

Lys Leu Lys Gln His Ile Glu Ala Ile Asp Val Arg Val Leu Leu Asp
2260 2265 2270

Gln Leu Gly Thr Thr Ile Ser Phe Glu Arg Ile Asn Asp Val Leu Glu
2275 2280 2285

His Val Lys His Phe Val Ile Asn Leu Ile Gly Asp Phe Glu Val Ala
2290 2295 2300

Glu Lys Ile Asn Ala Phe Arg Ala Lys Val His Glu Leu Ile Glu Arg
2305 2310 2315 2320

Tyr Glu Val Asp Gln Gln Ile Gln Val Leu Met Asp Lys Leu Val Glu
2325 2330 2335

Leu Thr His Gln Tyr Lys Leu Lys Glu Thr Ile Gln Lys Leu Ser Asn
2340 2345 2350

Val Leu Gln Gln Val Lys Ile Lys Asp Tyr Phe Glu Lys Leu Val Gly
2355 2360 2365

Phe Ile Asp Asp Ala Val Lys Lys Leu Asn Glu Leu Ser Phe Lys Thr
2370 2375 2380

Phe Ile Glu Asp Val Asn Lys Phe Leu Asp Met Leu Ile Lys Lys Leu
2385 2390 2395 2400

Lys Ser Phe Asp Tyr His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile
2405 2410 2415

Arg Glu Val Thr Gln Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu
2420 2425 2430

Pro Gln Lys Ala Glu Ala Leu Lys Leu Phe Leu Glu Glu Thr Lys Ala
2435 2440 2445

Thr Val Ala Val Tyr Leu Glu Ser Leu Gln Asp Thr Lys Ile Thr Leu
2450 2455 2460

Ile Ile Asn Trp Leu Gln Glu Ala Leu Ser Ser Ala Ser Leu Ala His
2465 2470 2475 2480

Met Lys Ala Lys Phe Arg Glu Thr Leu Glu Asp Thr Arg Asp Arg Met
2485 2490 2495

Tyr Asp Met Asp Ile Gln Gln Glu Leu Gln Arg Tyr Leu Ser Leu Val
2500 2505 2510

Gly Gln Val Tyr Ser Thr Leu Val Thr Tyr Ile Ser Asp Trp Trp Thr
2515 2520 2525

Leu Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gln Tyr Ser Ile Gln
2530 2535 2540

Asp Trp Ala Lys Arg Met Lys Ala Leu Val Glu Gln Gly Phe Thr Val
2545 2550 2555 2560

Pro Glu Ile Lys Thr Ile Leu Gly Thr Met Pro Ala Phe Glu Val Ser
2565 2570 2575

Leu Gln Ala Leu Gln Lys Ala Thr Phe Gln Thr Pro Asp Phe Ile Val
2580 2585 2590

Pro Leu Thr Asp Leu Arg Ile Pro Ser Val Gln Ile Asn Phe Lys Asp
2595 2600 2605

Leu Lys Asn Ile Lys Ile Pro Ser Arg Phe Ser Thr Pro Glu Phe Thr
2610 2615 2620

Ile Leu Asn Thr Phe His Ile Pro Ser Phe Thr Ile Asp Phe Val Glu
2625 2630 2635 2640

Met Lys Val Lys Ile Ile Arg Thr Ile Asp Gln Met Gln Asn Ser Glu
2645 2650 2655

Leu Gln Trp Pro Val Pro Asp Ile Tyr Leu Arg Asp Leu Lys Val Glu
2660 2665 2670

Asp Ile Pro Leu Ala Arg Ile Thr Leu Pro Asp Phe Arg Leu Pro Glu
2675 2680 2685

Ile Ala Ile Pro Glu Phe Ile Ile Pro Thr Leu Asn Leu Asn Asp Phe
2690 2695 2700

Gln Val Pro Asp Leu His Ile Pro Glu Phe Gln Leu Pro His Ile Ser
2705 2710 2715 2720

His Thr Ile Glu Val Pro Thr Phe Gly Lys Leu Tyr Ser Ile Leu Lys
2725 2730 2735

Ile Gln Ser Pro Leu Phe Thr Leu Asp Ala Asn Ala Asp Ile Gly Asn
2740 2745 2750

Gly Thr Thr Ser Ala Asn Glu Ala Gly Ile Ala Ala Ser Ile Thr Ala
2755 2760 2765

Lys Gly Glu Ser Lys Leu Glu Val Leu Asn Phe Asp Phe Gln Ala Asn
2770 2775 2780

Ala Gln Leu Ser Asn Pro Lys Ile Asn Pro Leu Ala Leu Lys Glu Ser
2785 2790 2795 2800

Val Lys Phe Ser Ser Lys Tyr Leu Arg Thr Glu His Gly Ser Glu Met
2805 2810 2815

Leu Phe Phe Gly Asn Ala Ile Glu Gly Lys Ser Asn Thr Val Ala Ser
2820 2825 2830

Leu His Thr Glu Lys Asn Thr Leu Glu Leu Ser Asn Gly Val Ile Val
2835 2840 2845

Lys Ile Asn Asn Gln Leu Thr Leu Asp Ser Asn Thr Lys Tyr Phe His
2850 2855 2860

Lys Leu Asn Ile Pro Lys Leu Asp Phe Ser Ser Gln Ala Asp Leu Arg
2865 2870 2875 2880

Asn Glu Ile Lys Thr Leu Leu Lys Ala Gly His Ile Ala Trp Thr Ser
2885 2890 2895

Ser Gly Lys Gly Ser Trp Lys Trp Ala Cys Pro Arg Phe Ser Asp Glu
2900 2905 2910

Gly Thr His Glu Ser Gln Ile Ser Phe Thr Ile Glu Gly Pro Leu Thr
2915 2920 2925

Ser Phe Gly Leu Ser Asn Lys Ile Asn Ser Lys His Leu Arg Val Asn
2930 2935 2940

Gln Asn Leu Val Tyr Glu Ser Gly Ser Leu Asn Phe Ser Lys Leu Glu
2945 2950 2955 2960

Ile Gln Ser Gln Val Asp Ser Gln His Val Gly His Ser Val Leu Thr
2965 2970 2975

Ala Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala Glu Phe Thr Gly
2980 2985 2990

Arg His Asp Ala His Leu Asn Gly Lys Val Ile Gly Thr Leu Lys Asn
2995 3000 3005

Ser Leu Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr Ala Ser Thr Asn
3010 3015 3020

Asn Glu Gly Asn Leu Lys Val Arg Phe Pro Leu Arg Leu Thr Gly Lys
3025 3030 3035 3040

Ile Asp Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser Pro Ser Ala Gln
3045 3050 3055

Gln Ala Ser Trp Gln Val Ser Ala Arg Phe Asn Gln Tyr Lys Tyr Asn
3060 3065 3070

Gln Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His Val
3075 3080 3085

Gly Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile Pro Leu Thr
3090 3095 3100

Ile Pro Glu Met Arg Leu Pro Tyr Thr Ile Ile Thr Thr Pro Pro Leu
3105 3110 3115 3120

Lys Asp Phe Ser Leu Trp Glu Lys Thr Gly Leu Lys Glu Phe Leu Lys
3125 3130 3135

Thr Thr Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr Lys Lys
3140 3145 3150

Asn Lys His Arg His Ser Ile Thr Asn Pro Leu Ala Val Leu Cys Glu
3155 3160 3165

Phe Ile Ser Gln Ser Ile Lys Ser Phe Asp Arg His Phe Glu Lys Asn
3170 3175 3180

Arg Asn Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr Asn Glu Thr Lys
3185 3190 3195 3200

Ile Lys Phe Asp Lys Tyr Lys Ala Glu Lys Ser His Asp Glu Leu Pro
3205 3210 3215

Arg Thr Phe Gln Ile Pro Gly Tyr Thr Val Pro Val Val Asn Val Glu
3220 3225 3230

Val Ser Pro Phe Thr Ile Glu Met Ser Ala Phe Gly Tyr Val Phe Pro
3235 3240 3245

Lys Ala Val Ser Met Pro Ser Phe Ser Ile Leu Gly Ser Asp Val Arg
3250 3255 3260

Val Pro Ser Tyr Thr Leu Ile Leu Pro Ser Leu Glu Leu Pro Val Leu
3265 3270 3275 3280

His Val Pro Arg Asn Leu Lys Leu Ser Leu Pro His Phe Lys Glu Leu
3285 3290 3295

Cys Thr Ile Ser His Ile Phe Ile Pro Ala Met Gly Asn Ile Thr Tyr
3300 3305 3310

Asp Phe Ser Phe Lys Ser Ser Val Ile Thr Leu Asn Thr Asn Ala Glu
3315 3320 3325

Leu Phe Asn Gln Ser Asp Ile Val Ala His Leu Leu Ser Ser Ser Ser
3330 3335 3340

Ser Val Ile Asp Ala Leu Gln Tyr Lys Leu Glu Gly Thr Thr Arg Leu
3345 3350 3355 3360

Thr Arg Lys Arg Gly Leu Lys Leu Ala Thr Ala Leu Ser Leu Ser Asn
3365 3370 3375

Lys Phe Val Glu Gly Ser His Asn Ser Thr Val Ser Leu Thr Thr Lys
3380 3385 3390

Asn Met Glu Val Ser Val Ala Lys Thr Thr Lys Ala Glu Ile Pro Ile
3395 3400 3405

Leu Arg Met Asn Phe Lys Gln Glu Leu Asn Gly Asn Thr Lys Ser Lys
3410 3415 3420

Pro Thr Val Ser Ser Ser Met Glu Phe Lys Tyr Asp Phe Asn Ser Ser
3425 3430 3435 3440

Met Leu Tyr Ser Thr Ala Lys Gly Ala Val Asp His Lys Leu Ser Leu
3445 3450 3455

Glu Ser Leu Thr Ser Tyr Phe Ser Ile Glu Ser Ser Thr Lys Gly Asp
3460 3465 3470

Val Lys Gly Ser Val Leu Ser Arg Glu Tyr Ser Gly Thr Ile Ala Ser
3475 3480 3485

Glu Ala Asn Thr Tyr Leu Asn Ser Lys Ser Thr Arg Ser Ser Val Lys
3490 3495 3500

Leu Gln Gly Thr Ser Lys Ile Asp Asp Ile Trp Asn Leu Glu Val Lys
3505 3510 3515 3520

Glu Asn Phe Ala Gly Glu Ala Thr Leu Gln Arg Ile Tyr Ser Leu Trp
3525 3530 3535

Glu His Ser Thr Lys Asn His Leu Gln Leu Glu Gly Leu Phe Phe Thr
3540 3545 3550

Asn Gly Glu His Thr Ser Lys Ala Thr Leu Glu Leu Ser Pro Trp Gln
3555 3560 3565

Met Ser Ala Leu Val Gln Val His Ala Ser Gln Pro Ser Ser Phe His
3570 3575 3580

Asp Phe Pro Asp Leu Gly Gln Glu Val Ala Leu Asn Ala Asn Thr Lys
3585 3590 3595 3600

Asn Gln Lys Ile Arg Trp Lys Asn Glu Val Arg Ile His Ser Gly Ser
3605 3610 3615

Phe Gln Ser Gln Val Glu Leu Ser Asn Asp Gln Glu Lys Ala His Leu
3620 3625 3630

Asp Ile Ala Gly Ser Leu Glu Gly His Leu Arg Phe Leu Lys Asn Ile
3635 3640 3645

Ile Leu Pro Val Tyr Asp Lys Ser Leu Trp Asp Phe Leu Lys Leu Asp
3650 3655 3660




Val Thr Thr Ser Ile Gly Arg Arg Gln His Leu Arg Val Ser Thr Ala
3665 3670 3675 3680

Phe Val Tyr Thr Lys Asn Pro Asn Gly Tyr Ser Phe Ser Ile Pro Val
3685 3690 3695

Lys Val Leu Ala Asp Lys Phe Ile Thr Pro Gly Leu Lys Leu Asn Asp
3700 3705 3710

Leu Asn Ser Val Leu Val Met Pro Thr Phe His Val Pro Phe Thr Asp
3715 3720 3725

Leu Gln Val Pro Ser Cys Lys Leu Asp Phe Arg Glu Ile Gln Ile Tyr
3730 3735 3740

Lys Lys Leu Arg Thr Ser Ser Phe Ala Leu Asn Leu Pro Thr Leu Pro
3745 3750 3755 3760

Glu Val Lys Phe Pro Glu Val Asp Val Leu Thr Lys Tyr Ser Gln Pro
3765 3770 3775

Glu Asp Ser Leu Ile Pro Phe Phe Glu Ile Thr Val Pro Glu Ser Gln
3780 3785 3790

Leu Thr Val Ser Gln Phe Thr Leu Pro Lys Ser Val Ser Asp Gly Ile
3795 3800 3805

Ala Ala Leu Asp Leu Asn Ala Val Ala Asn Lys Ile Ala Asp Phe Glu
3810 3815 3820

Leu Pro Thr Ile Ile Val Pro Glu Gln Thr Ile Glu Ile Pro Ser Ile
3825 3830 3835 3840

Lys Phe Ser Val Pro Ala Gly Ile Val Ile Pro Ser Phe Gln Ala Leu
3845 3850 3855

Thr Ala Arg Phe Glu Val Asp Ser Pro Val Tyr Asn Ala Thr Trp Ser
3860 3865 3870

Ala Ser Leu Lys Asn Lys Ala Asp Tyr Val Glu Thr Val Leu Asp Ser
3875 3880 3885

Thr Cys Ser Ser Thr Val Gln Phe Leu Glu Tyr Glu Leu Asn Val Leu
3890 3895 3900

Gly Thr His Lys Ile Glu Asp Gly Thr Leu Ala Ser Lys Thr Lys Gly
3905 3910 3915 3920

Thr Leu Ala His Arg Asp Phe Ser Ala Glu Tyr Glu Glu Asp Gly Lys
3925 3930 3935

Phe Glu Gly Leu Gln Glu Trp Glu Gly Lys Ala His Leu Asn Ile Lys
3940 3945 3950

Ser Pro Ala Phe Thr Asp Leu His Leu Arg Tyr Gln Lys Asp Lys Lys
3955 3960 3965

Gly Ile Ser Thr Ser Ala Ala Ser Pro Ala Val Gly Thr Val Gly Met
3970 3975 3980

Asp Met Asp Glu Asp Asp Asp Phe Ser Lys Trp Asn Phe Tyr Tyr Ser
3985 3990 3995 4000

Pro Gln Ser Ser Pro Asp Lys Lys Leu Thr Ile Phe Lys Thr Glu Leu
4005 4010 4015

Arg Val Arg Glu Ser Asp Glu Glu Thr Gln Ile Lys Val Asn Trp Glu
4020 4025 4030

Glu Glu Ala Ala Ser Gly Leu Leu Thr Ser Leu Lys Asp Asn Val Pro
4035 4040 4045

Lys Ala Thr Gly Val Leu Tyr Asp Tyr Val Asn Lys Tyr His Trp Glu
4050 4055 4060

His Thr Gly Leu Thr Leu Arg Glu Val Ser Ser Lys Leu Arg Arg Asn
4065 4070 4075 4080

Leu Gln Asn Asn Ala Glu Trp Val Tyr Gln Gly Ala Ile Arg Gln Ile
4085 4090 4095

Asp Asp Ile Asp Val Arg Phe Gln Lys Ala Ala Ser Gly Thr Thr Gly
4100 4105 4110

Thr Tyr Gln Glu Trp Lys Asp Lys Ala Gln Asn Leu Tyr Gln Glu Leu
4115 4120 4125

Leu Thr Gln Glu Gly Gln Ala Ser Phe Gln Gly Leu Lys Asp Asn Val
4130 4135 4140

Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His Met Lys Val Lys
4145 4150 4155 4160

His Leu Ile Asp Ser Leu Ile Asp Phe Leu Asn Phe Pro Arg Phe Gln
4165 4170 4175

Phe Pro Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met
4180 4185 4190

Phe Ile Arg Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val
4195 4200 4205

His Asn Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp Leu Val Ile
4210 4215 4220

Thr Leu Pro Phe Glu Leu Arg Lys His Lys Leu Ile Asp Val Ile Ser
4225 4230 4235 4240

Met Tyr Arg Glu Leu Leu Lys Asp Leu Ser Lys Glu Ala Gln Glu Val
4245 4250 4255

Phe Lys Ala Ile Gln Ser Leu Lys Thr Thr Glu Val Leu Arg Asn Leu
4260 4265 4270

Gln Asp Leu Leu Gln Phe Ile Phe Gln Leu Ile Glu Asp Asn Ile Lys
4275 4280 4285

Gln Leu Lys Glu Met Lys Phe Thr Tyr Leu Ile Asn Tyr Ile Gln Asp
4290 4295 4300

Glu Ile Asn Thr Ile Phe Asn Asp Tyr Ile Pro Tyr Val Phe Lys Leu
4305 4310 4315 4320

Leu Lys Glu Asn Leu Cys Leu Asn Leu His Lys Phe Asn Glu Phe Ile
4325 4330 4335

Gln Asn Glu Leu Gln Glu Ala Ser Gln Glu Leu Gln Gln Ile His Gln
4340 4345 4350

Tyr Ile Met Ala Leu Arg Glu Glu Tyr Phe Asp Pro Ser Ile Val Gly
4355 4360 4365

Trp Thr Val Lys Tyr Tyr Glu Leu Glu Glu Lys Ile Val Ser Leu Ile
4370 4375 4380

Lys Asn Leu Leu Val Ala Leu Lys Asp Phe His Ser Glu Tyr Ile Val
4385 4390 4395 4400

Ser Ala Ser Asn Phe Thr Ser Gln Leu Ser Ser Gln Val Glu Gln Phe
4405 4410 4415

Leu His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu Thr Asp Pro Asp
4420 4425 4430

Gly Lys Gly Lys Glu Lys Ile Ala Glu Leu Ser Ala Thr Ala Gln Glu
4435 4440 4445

Ile Ile Lys Ser Gln Ala Ile Ala Thr Lys Lys Ile Ile Ser Asp Tyr
4450 4455 4460

His Gln Gln Phe Arg Tyr Lys Leu Gln Asp Phe Ser Asp Gln Leu Ser
4465 4470 4475 4480

Asp Tyr Tyr Glu Lys Phe Ile Ala Glu Ser Lys Arg Leu Ile Asp Leu
4485 4490 4495

Ser Ile Gln Asn Tyr His Thr Phe Leu Ile Tyr Ile Thr Glu Leu Leu
4500 4505 4510

Lys Lys Leu Gln Ser Thr Thr Val Met Asn Pro Tyr Met Lys Leu Ala
4515 4520 4525

Pro Gly Glu Leu Thr Ile Ile Leu
4530 4535

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Pro Xaa Pro
1

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Lys Tyr Thr Tyr Asn Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly
1 5 10 15

Thr Ala Asp Ser Arg Ser Ala Thr Arg Ile Asn Cys Lys Val Glu Leu
20 25 30

Glu Val Pro Gln Leu Cys Ser Phe Ile Leu Lys Thr Ser Gln
35 40 45

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ala Tyr Asp Phe Asn Tyr Pro Ile Lys Lys Asp Ser Ser Ser Gln Leu
1 5 10 15

Leu Ser Val Gln Gln Gly Glu Thr Ile Tyr Ile Leu Asn Lys Asn Ser
20 25 30

Ser Gly Trp Trp Asp Gly Leu Val Ile Asp Asp Ser Asn
35 40 45

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Val Tyr Gly Phe Asn Pro Glu Gly Lys Ala Leu Leu Lys Lys Thr Lys
1 5 10 15

Asn Ser Glu Glu Phe Ala Ala Ala Met Ser Arg Tyr Glu Leu Lys Leu
20 25 30

Ala Ile Pro Glu Gly Lys Gln Val Phe Leu Tyr Pro Glu
35 40 45

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Leu Tyr Asp Phe Val Ala Ser Gly Asp Asn Thr Leu Ser Ile Thr Lys
1 5 10 15

Gly Glu Lys Leu Arg Val Leu Gly Tyr Asn His Tyr Asn Gly Glu Trp
20 25 30

Cys Glu Ala Gln Thr Lys Asn Gly Gln Gly Trp Val Pro Ser Asn
35 40 45

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Phe Leu Pro Phe Ser Tyr Asn Asn Lys Tyr Gly Met Val Ala Gln Val
1 5 10 15

Thr Gln Thr Leu Lys Leu Glu Asp Thr Pro Lys Ile Asn Ser Arg Phe
20 25 30

Phe Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe
35 40

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Phe Asp Tyr Lys Ala Gln Arg Glu Asp Glu Leu Thr Phe Thr Lys
1 5 10 15

Ser Ala Ile Ile Gln Asn Val Glu Lys Gln Glu Gly Gly Trp Trp Arg
20 25 30

Gly Asp Tyr Gly Gly Lys Lys Gln Leu Trp Phe
35 40

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Phe Leu Pro Phe Ser Tyr Asn Asn Lys Tyr Gly Met Val Ala Gln Val
1 5 10 15

Thr Gln Thr Leu Lys Leu Glu Asp Thr Pro Lys Ile Asn Ser Arg Phe
20 25 30

Phe Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe Glu
35 40 45

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu His Ser Tyr Glu Pro Ser His Asp Gly Asp Leu Gly Phe Glu Lys
1 5 10 15

Gly Glu Gln Leu Arg Ile Leu Glu Gln Ser Gly Glu Trp Trp Lys Ala
20 25 30

Gln Ser Leu Thr Thr Gly Gln Glu Gly Phe Ile Pro Phe Asn
35 40 45

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 62 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Tyr Thr Tyr Leu Ile Leu Arg Val Ile Gly Asn Met Gly Gln Thr Met
1 5 10 15

Glu Gln Leu Thr Pro Glu Leu Lys Ser Ser Ile Leu Lys Cys Val Gln
20 25 30

Ser Thr Lys Pro Ser Leu Met Ile Gln Lys Ala Ala Ile Gln Ala Leu
35 40 45

Arg Lys Met Glu Pro Lys Asp Lys Asp Gln Glu Val Leu Leu
50 55 60

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Val Val Ala Leu Phe Asp Tyr Ala Ala Val Asn Asp Arg Asp Leu Gln
1 5 10 15

Val Leu Lys Gly Glu Lys Leu Gln Val Leu Arg Ser Thr Gly Asp Trp
20 25 30

Trp Leu Ala Arg Ser Leu Val Thr Gly Arg Glu Gly Tyr Val Pro Ser
35 40 45

Asn Phe Val Ala Pro
50

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Ala Phe Gly Phe Ala Ser Ala Asp Leu Ile Glu Ile Gly Leu Glu Gly
1 5 10 15

Lys Gly Phe Glu Pro Thr Leu Glu Ala Leu Phe Gly Lys Gln Gly Phe
20 25 30

Phe Pro Asp Ser Val Asn Lys Ala Leu Tyr Trp Val Asn Gly Gln Val
35 40 45

Pro Asp
50

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Leu Tyr Asp Phe Ala Ala Glu Asn Pro Asp Glu Leu Thr Phe Asn Glu
1 5 10 15

Gly Ala Val Val Thr Val Ile Asn Lys Ser Asn Pro Asp Trp Trp Glu
20 25 30

Gly Glu Leu Asn Gly Gln Arg Gly Val Phe Pro Ala Ser Tyr Val Glu
35 40 45

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Phe Gly Tyr Thr Lys Asp Asp Lys His Glu Gln Asp Met Val Asn Gly
1 5 10 15

Ile Met Leu Ser Val Glu Lys Leu Ile Lys Asp Leu Lys Ser Lys Glu
20 25 30

Val Pro Glu Ala Arg Ala Tyr Leu Arg Ile Leu Gly Glu Glu
35 40 45

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Tyr Asp Tyr Lys Lys Glu Glu Glu Asp Ile Asp Leu His Leu Gly Asp
1 5 10 15

Ile Leu Thr Val Asn Lys Gly Ser Leu Val Ala Leu Gly Phe Ser Asp
20 25 30

Gly Gln Glu Ala Lys Pro Glu Glu Ile Gly Trp Leu Asn Gly Tyr Asn
35 40 45

Glu

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Phe Asp Tyr His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile Arg Glu
1 5 10 15

Val Thr Gln Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu Pro Gln
20 25 30

Lys Ala Glu Ala Leu Lys Leu Phe Leu Glu Glu Thr Lys Ala Thr Val
35 40 45

Ala Val Tyr Leu
50

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Tyr Asp Tyr Gln Glu Lys Ser Pro Arg Glu Val Thr Met Lys Lys Gly
1 5 10 15

Asp Ile Leu Thr Leu Leu Asn Ser Thr Asn Lys Asp Trp Trp Lys Val
20 25 30

Glu Val Asn Asp Arg Gln Gly Phe Val Pro Ala Ala Tyr Val
35 40 45

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Tyr Asp Met Asp Ile Gln Gln Glu Leu Gln Arg Tyr Leu Ser Leu Val
1 5 10 15

Gly Gln Val Tyr Ser Thr Leu Val Thr Tyr Ile Ser Asp Trp Trp Thr
20 25 30

Leu Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gln Tyr Ser Ile Gln
35 40 45

Asp Trp Ala
50

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
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Phe Asp Tyr Lys Ala Gin Arg Glu Asp Glu Leu Thr Phe Thr Lys Ser 
15 10 15 

Ala lie lie Gin Asn Val Glu Lys Gin Asp Gly Gly Trp Trp Arg Gly 
20 25 30 

Asp Tyr Gly Gly Lys Lys Gin Leu Trp Phe Pro Ser Asn Tyr Val Glu 
35 40 45 

Glu Met lie 
50 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Tyr Asp Met Asp He Gin Gin Glu Leu Gin Arg Tyr Leu Ser Leu Val 
15 10 15 

Gly Gin Val Tyr Ser Thr Leu Val Thr Tyr He Ser Asp Trp Trp Thr 
20 25 30 

Leu Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gin Tyr Ser He Gin 
35 40 45 

Asp Trp Ala Lys Arg Met Lys 
50 55 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Ile Gln Asp Tyr Glu Pro Arg Leu Thr Asp Glu Ile Arg Ile Ser Leu
1 5 10 15

Gly Glu Lys Val Lys Ile Leu Ala Thr His Thr Asp Gly Trp Cys Leu
20 25 30

Val Glu Lys Cys Asn Thr Arg Lys Gly Thr Ile His Val Ser Val Asp
35 40 45

Asp Lys Arg Tyr Leu
50 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Tyr Asp Tyr Glu Ala Arg Thr Glu Asp Asp Leu Thr Phe Thr Lys Gly
1 5 10 15

Glu Lys Phe His Ile Leu Asn Asn Thr Glu Gly Asp Trp Trp Glu Ala
20 25 30

Arg Ser Leu Ser Ser Gly Lys Thr Gly Cys Ile Pro Ser Asn Tyr Val
35 40 45

Ala 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Thr Tyr Asp Phe Ser Phe Lys Ser Ser Val Ile Thr Leu Asn Thr Asn
1 5 10 15

Ala Glu Leu Phe Asn Gln Ser Asp Ile Val Ala His Leu Leu Ser Ser
20 25 30

Ser Ser Ser Val Ile Asp Ala Leu Gln Tyr Lys Leu Glu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Asp Phe Asn Tyr Pro Ile Lys Lys Asp Ser Ser Ser Gln Leu Leu Ser
1 5 10 15

Val Gln Gln Gly Glu Thr Ile Tyr Ile Leu Asn Lys Asn Ser Ser Gly
20 25 30

Trp Trp Asp Gly Leu Val Ile Asp Asp Ser Asn Gly Lys Val Asn
35 40 45




(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Lys Tyr Asp Phe Asn Ser Ser Met Leu Tyr Ser Thr Ala Lys Gly Ala
1 5 10 15

Val Asp His Lys Leu Ser Leu Glu Ser Leu Thr Ser Tyr Phe Ser Ile
20 25 30

Glu Ser Ser Thr Lys Gly Asp Val Lys Gly Ser Val Leu Ser Arg Glu
35 40 45

Tyr 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Glu Pro Tyr Val Ala Ile Lys Ala Tyr Thr Ala Val Glu Gly Asp Glu
1 5 10 15

Val Ser Leu Leu Glu Gly Glu Ala Val Glu Val Ile His Lys Leu Leu
20 25 30

Asp Gly Trp Trp Val Ile Arg Lys Asp Asp Val Thr Gly Tyr Phe Pro
35 40 45

Ser Met Tyr Leu
50



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Leu Trp Asp Phe Leu Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg
1 5 10 15

Gln His Leu Arg Val Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn
20 25 30

Gly Tyr Ser Phe Ser Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile
35 40 45 

Thr Pro Gly Leu Lys Leu 
50 



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Leu Tyr Asp Phe Lys Ala Glu Lys Ala Asp Glu Leu Thr Thr Tyr Val
1 5 10 15

Gly Glu Asn Leu Phe Ile Cys Ala His His Asn Cys Glu Trp Phe Ile
20 25 30

Ala Lys Pro Ile Gly Arg Leu Gly Gly Pro Gly Leu Val Pro Val Gly
35 40 45

Phe Val Ser Ile Ile Asp Ile
50 55



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Val Leu Tyr Asp Tyr Val Asn Lys Tyr His Trp Glu His Thr Gly Leu
1 5 10 15

Thr Leu Arg Glu Val Ser Ser Lys Leu Arg Arg Asn Leu Gln Asn Asn
20 25 30

Ala Glu Trp Val Tyr Gln Gly Ala Ile Arg Gln Ile Asp Asp Ile
35 40 45



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Val Leu Tyr Asp Phe Lys Ala Glu Lys Ala Asp Glu Leu Thr Thr Tyr
1 5 10 15

Val Gly Glu Asn Leu Phe Ile Cys Ala His His Asn Cys Glu Trp Phe
20 25 30

Ile Ala Lys Pro Ile Gly Arg Leu
35 40 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met Phe Ile Arg
1 5 10 15

Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val His Asn Gly
20 25 30

Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp Leu
35 40 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Leu Phe Gly Phe Val Pro Glu Thr Lys Glu Glu Leu Gln Val Met Pro
1 5 10 15

Gly Asn Ile Val Phe Val Leu Lys Lys Gly Asn Asp Asn Trp Ala Thr
20 25 30

Val Met Phe Asn Gly Gln Lys Gly Leu Val Pro Cys Asn Tyr Leu Glu
35 40 45 

Pro Val Glu Leu 
50 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met Phe Ile
1 5 10 15

Arg Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val His Asn
20 25 30

Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp
35 40 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Ala Lys Phe Asp Tyr Val Ala Gln Gln Glu Gln Glu Leu Asp Ile Lys
1 5 10 15

Lys Asn Glu Arg Leu Trp Leu Leu Asp Asp Ser Lys Ser Trp Trp Arg
20 25 30

Val Arg Asn Ser Met Asn Lys Thr Gly Phe Val Pro Ser Asn Tyr Val
35 40 45

Glu Arg Lys Asn
50 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 85 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Trp Tyr His Ala Ser Leu Thr Arg Ala Gln Ala Glu His Met Leu Met
1 5 10 15

Arg Val Pro Arg Asp Gly Ala Phe Leu Val Arg Lys Arg Asn Glu Pro
20 25 30

Asn Ser Tyr Ala Ile Ser Phe Arg Ala Glu Gly Lys Ile Lys His Cys
35 40 45

Arg Val Gln Gln Glu Gly Thr Val Met Leu Gly Asn Ser Glu Phe Asp
50 55 60

Ser Leu Val Asp Leu Ile Ser Tyr Tyr Glu Lys His Pro Leu Tyr Arg
65 70 75 80 

Lys Met Lys Leu Lys 
85 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

Phe Phe Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe Glu Ser Thr
1 5 10 15

Lys Ser Thr Ser Pro Pro Lys Gln Ala Glu Ala Val Leu Lys Thr Leu
20 25 30

Gln Glu Leu Lys Lys Leu Thr Ile Ser Glu Gln Asn Ile Gln Arg Ala
35 40 45

Asn Leu Phe Asn Lys Leu Val Thr Glu Leu Arg Gly Leu Ser Asp Glu
50 55 60

Ala Val Thr Ser Leu Leu Pro Gln Leu Ile Glu Val Ser Ser Pro Ile
65 70 75 80

Thr Leu Gln Ala Leu Val Gln Cys Gly Gln Pro Cys Ser Thr His Ile
85 90 95

Leu Gln Trp Leu Lys Arg Val His Ala Asn
100 105 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Trp Phe His Gly Lys Ile Ser Lys Gln Glu Ala Tyr Asn Leu Leu Met
1 5 10 15

Thr Val Gly Gln Ala Cys Ser Phe Leu Val Arg Pro Ser Asp Asn Thr
20 25 30

Pro Gly Asp Tyr Ser Leu Tyr Phe Arg Thr Ser Glu Asn Ile Gln Arg
35 40 45

Phe Lys Ile Cys Pro Thr Pro Asn Asn Gln Phe Met Met Gly Gly Arg
50 55 60

Tyr Tyr Asn Ser Ser Ile Gly Asp Ile Ile Asp His Tyr Arg Lys Glu
65 70 75 80

Gln Ile Val Glu Gly Tyr Tyr Leu Lys Glu Pro
85 90 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Ile Met Leu Ser Val Glu Lys Leu Ile Lys Asp Leu Lys Ser Lys Glu
1 5 10 15

Val Pro Glu Ala Arg Ala Tyr Leu Arg Ile Leu Gly Glu Glu Leu Gly
20 25 30

Phe Ala Ser Leu His Asp Leu Gln Leu Leu Gly Lys Leu Leu Leu Met
35 40 45

Gly Ala Arg Thr Leu Gln Gly Ile Pro Gln Met Ile Gly Glu Val Ile
50 55 60

Arg Lys Gly Ser Lys Asn Asp Phe Phe Leu His Tyr Ile Phe Met Glu
65 70 75 80

Asn Ala Phe Glu Leu Pro Thr Gly Ala Gly Leu Gln Leu
85 90 



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Trp Phe His Gly Lys Ile Ser Lys Gln Glu Ala Tyr Asn Leu Leu Met
1 5 10 15

Thr Val Gly Gln Ala Cys Ser Phe Leu Val Arg Pro Ser Asp Asn Thr
20 25 30

Pro Gly Asp Tyr Ser Leu Tyr Phe Arg Thr Ser Glu Asn Ile Gln Arg
35 40 45

Phe Lys Ile Cys Pro Thr Pro Asn Asn Gln Phe Met Met Gly Gly Arg
50 55 60

Tyr Tyr Asn Ser Ser Ile Gly Asp Ile Ile Asp His Tyr Arg Lys Glu
65 70 75 80

Gln Ile Val Glu Gly Tyr Tyr Leu Lys
85



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 77 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Tyr Phe His Lys Leu Asn Ile Pro Lys Leu Asp Phe Ser Ser Gln Ala
1 5 10 15

Asp Leu Arg Asn Glu Ile Lys Thr Leu Leu Lys Ala Gly His Ile Ala
20 25 30

Trp Thr Ser Ser Gly Lys Gly Ser Trp Lys Trp Ala Cys Pro Arg Phe
35 40 45

Ser Asp Glu Gly Thr His Glu Ser Gln Ile Ser Phe Thr Ile Glu Gly
50 55 60

Pro Leu Thr Ser Phe Gly Leu Ser Asn Lys Ile Asn Ser
65 70 75 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Trp Tyr Trp Gly Asp Ile Ser Arg Glu Glu Val Asn Glu Lys Leu Arg
1 5 10 15

Asp Thr Pro Asp Gly Thr Phe Leu Val Arg Asp Ala Ser Ser Lys Ile
20 25 30

Gln Gly Asp Tyr Thr Leu Thr Leu Arg Lys Gly Gly Asn Asn Lys Leu
35 40 45

Ile Lys Val Phe His Arg Asp Gly Lys Tyr Gly Phe Ser Glu Pro Leu
50 55 60

Thr Phe Cys Ser Val Val Asp Leu Ile Thr His Tyr Arg His Glu Ser
65 70 75 80

Leu Ala Gln Tyr Asn Ala Lys Leu Asp Thr Arg Leu Leu Tyr Pro Val
85 90 95 

Ser Lys Tyr 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr Ala Ser Thr Asn Asn Glu
1 5 10 15

Gly Asn Leu Lys Val Arg Phe Pro Leu Arg Leu Thr Gly Lys Ile Asp
20 25 30

Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser Pro Ser Ala Gln Gln Ala
35 40 45

Ser Trp Gln Val Ser Ala Arg Phe Asn Gln Tyr Lys Tyr Asn Gln Asn
50 55 60

Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His Val Gly Ile
65 70 75 80

Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile Pro Leu Thr Ile Pro
85 90 95

Glu Met Arg Leu
100



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Trp Phe His Gly Lys Leu Gly Ala Gly Arg Asp Gly Arg His Ile Ala
1 5 10 15

Glu Arg Leu Leu Thr Glu Tyr Cys Ile Glu Thr Gly Ala Pro Asp Gly
20 25 30

Ser Phe Leu Val Arg Glu Ser Glu Thr Phe Val Gly Asp Tyr Thr Leu
35 40 45

Ser Phe Trp Arg Asn Gly Lys Val Gln His Cys Arg Ile His Ser Arg
50 55 60

Gln Asp Ala Gly Thr Pro Lys Phe Phe Leu Thr Asp Asn Leu Val Phe
65 70 75 80

Asp Ser Leu Tyr Asp Leu Ile Thr His Tyr Gln Gln Val Pro Leu Arg
85 90 95

Cys Asn Glu Phe Glu Met Arg Leu Ser Glu
100 105



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Phe Pro Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met
1 5 10 15

Phe Ile Arg Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val
20 25 30

His Asn Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp Leu Val Ile
35 40 45

Thr Leu Pro Phe Glu Leu Arg Lys His Lys Leu Ile Asp Val Ile Ser
50 55 60

Met Tyr Arg Glu Leu Leu Lys Asp Leu Ser Lys Glu Ala Gln Glu Val
65 70 75 80

Phe Lys Ala Ile Gln Ser Leu Lys Thr Thr Glu
85 90 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 203 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Val Ser Asp Gly Ile Ala Ala Leu Asp Leu Asn Ala Val Ala Asn Lys
1 5 10 15

Ile Ala Asp Phe Glu Leu Pro Thr Ile Ile Val Pro Glu Gln Thr Ile
20 25 30

Glu Ile Pro Ser Ile Lys Phe Ser Val Pro Ala Gly Ile Val Ile Pro
35 40 45

Ser Phe Gln Ala Leu Thr Ala Arg Phe Glu Val Asp Ser Pro Val Tyr
50 55 60

Asn Ala Thr Trp Ser Ala Ser Leu Lys Asn Lys Ala Asp Tyr Val Glu
65 70 75 80

Thr Val Leu Asp Ser Thr Cys Ser Ser Thr Val Gln Phe Leu Glu Tyr
85 90 95

Glu Leu Asn Val Leu Gly Thr His Lys Ile Glu Asp Gly Thr Leu Ala
100 105 110

Ser Lys Thr Lys Gly Thr Leu Ala His Arg Asp Phe Ser Ala Glu Tyr
115 120 125

Glu Glu Asp Gly Lys Phe Glu Gly Leu Gln Glu Trp Glu Gly Lys Ala
130 135 140

His Leu Asn Ile Lys Ser Pro Ala Phe Thr Asp Leu His Leu Arg Tyr
145 150 155 160

Gln Lys Asp Lys Lys Gly Ile Ser Thr Ser Ala Ala Ser Pro Ala Val
165 170 175

Gly Thr Val Gly Met Asp Met Asp Glu Asp Asp Asp Phe Ser Lys Trp
180 185 190

Asn Phe Tyr Tyr Ser Pro Gln Ser Ser Pro Asp
195 200



(2) INFORMATION FOR SEQ ID NO: 47: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Leu Gly Gln Gly Cys Phe Gly Glu Val Trp Met Gly Thr Trp Asn Gly
1 5 10 15

Thr Thr Arg Val Ala Ile Lys Thr Leu Lys Pro Gly Thr Met Ser Pro
20 25 30

Glu Ala Phe Leu Gln Glu Ala Gln Val Met Lys Lys Leu Arg His Glu
35 40 45

Lys Leu Val Gln Leu Tyr Ala Val Val Ser Glu Glu Pro Ile Tyr Ile
50 55 60

Val Thr Glu Tyr Met Ser Lys Gly Ser Leu Leu Asp Phe Leu Lys Gly
65 70 75 80

Glu Thr Gly Lys Tyr Leu Arg Leu Pro Gln Leu Val Asp Met Ala Ala
85 90 95

Gln Ile Ala Ser Gly Met Ala Tyr Val Glu Arg Met Asn Tyr Val His
100 105 110

Arg Asp Leu Arg Ala Ala Asn Ile Leu Val Gly Glu Asn Leu Val Cys
115 120 125

Lys Val Ala Asp Phe Gly Leu Ala Arg Leu Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Gln Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ala Leu Tyr Gly Arg Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Thr Glu Leu Thr Thr Lys Gly Arg Val Pro Tyr Pro
180 185 190

Gly Met Val Asn Arg Glu Val Leu Asp Gln Val Glu Arg Gly Tyr Arg
195 200 205

Met Pro Cys Pro Pro Glu
210 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Leu Gly Asn Gly Gln Phe Gly Glu Val Trp Met Gly Thr Trp Asn Gly
1 5 10 15

Asn Thr Lys Val Ala Ile Lys Thr Leu Lys Pro Gly Thr Met Ser Pro
20 25 30

Glu Ser Phe Leu Glu Glu Ala Gln Ile Met Lys Lys Leu Lys His Asp
35 40 45

Lys Leu Val Gln Leu Tyr Ala Val Val Ser Glu Glu Pro Ile Tyr Ile
50 55 60

Val Thr Glu Tyr Met Asn Lys Gly Ser Leu Leu Asp Phe Leu Lys Asp
65 70 75 80

Gly Glu Gly Arg Ala Leu Lys Leu Pro Asn Leu Val Asp Met Ala Ala
85 90 95

Gln Val Ala Ala Gly Met Ala Tyr Ile Glu Arg Met Asn Tyr Ile His
100 105 110

Arg Asp Leu Arg Ser Ala Asn Ile Leu Val Gly Asn Gly Leu Ile Cys
115 120 125

Lys Ile Ala Asp Phe Gly Leu Ala Arg Leu Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Gln Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ala Leu Tyr Gly Arg Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Thr Glu Leu Val Thr Lys Gly Arg Val Pro Tyr Pro
180 185 190

Gly Met Asn Asn Arg Glu Val Leu Glu Gln Val Glu Arg Gly Tyr Arg
195 200 205

Met Pro Cys Pro Gln
210 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

Leu Gly Ala Gly Gln Phe Gly Glu Val Trp Met Ala Thr Tyr Asn Lys
1 5 10 15

His Thr Lys Val Ala Val Lys Thr Met Lys Pro Gly Ser Met Ser Val
20 25 30

Glu Ala Phe Leu Ala Glu Ala Asn Val Met Lys Thr Leu Gln His Asp
35 40 45

Lys Leu Val Lys Leu His Ala Val Val Thr Lys Glu Pro Ile Tyr Ile
50 55 60

Ile Thr Glu Phe Met Ala Lys Gly Ser Leu Leu Asp Phe Leu Lys Ser
65 70 75 80

Asp Glu Gly Ser Lys Gln Pro Leu Pro Lys Leu Ile Asp Phe Ser Ala
85 90 95

Gln Ile Ala Glu Gly Met Ala Phe Ile Glu Gln Arg Asn Tyr Ile His
100 105 110

Arg Asp Leu Arg Ala Ala Asn Ile Leu Val Ser Ala Ser Leu Val Cys
115 120 125

Lys Ile Ala Asp Phe Gly Leu Ala Arg Val Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Glu Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160






Ala Ile Asn Phe Gly Ser Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Met Glu Ile Val Thr Tyr Gly Arg Ile Pro Tyr Pro
180 185 190

Gly Met Ser Asn Pro Glu Val Ile Arg Ala Leu Glu Arg Gly Tyr Arg
195 200 205 

Met Pro Arg Pro Glu 
210 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Leu Gly Ala Gly Gln Phe Gly Glu Val Trp Met Gly Tyr Tyr Asn Asn
1 5 10 15

Ser Thr Lys Val Ala Val Lys Thr Leu Lys Pro Gly Thr Met Ser Val
20 25 30

Gln Ala Phe Leu Glu Glu Ala Asn Leu Met Lys Thr Leu Gln His Asp
35 40 45

Lys Leu Val Arg Leu Tyr Ala Val Val Thr Arg Glu Glu Pro Ile Tyr
50 55 60

Ile Ile Thr Glu Tyr Met Ala Lys Gly Ser Leu Leu Asp Phe Leu Lys
65 70 75 80

Ser Asp Glu Gly Gly Lys Val Leu Leu Pro Lys Leu Ile Asp Phe Ser
85 90 95

Ala Gln Ile Ala Glu Gly Met Ala Tyr Ile Glu Arg Lys Asn Tyr Ile
100 105 110

His Arg Asp Leu Arg Ala Ala Asn Val Leu Val Ser Glu Ser Leu Met
115 120 125 

Cys Lys Ile Ala Asp Phe Gly Leu Ala Arg Val Ile Glu Asp Asn Glu
130 135 140

Tyr Thr Ala Arg Glu Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro
145 150 155 160

Glu Ala Ile Asn Phe Gly Cys Phe Thr Ile Lys Ser Asp Val Trp Ser
165 170 175

Phe Gly Ile Leu Leu Tyr Glu Ile Val Thr Tyr Gly Lys Ile Pro Tyr
180 185 190

Pro Gly Arg Thr Asn Ala Asp Val Met Thr Ala Leu Ser Gln Gly Tyr
195 200 205

Arg Met Pro Arg Val Glu Asn Cys Pro Asp
210 215 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Leu Gly Ala Gly Gln Phe Gly Glu Val Trp Met Gly Tyr Tyr Asn Gly
1 5 10 15

His Thr Lys Val Ala Val Lys Ser Leu Lys Gln Gly Ser Met Ser Pro
20 25 30

Asp Ala Phe Leu Ala Glu Ala Asn Leu Met Lys Gln Leu Gln His Gln
35 40 45

Arg Leu Val Arg Leu Tyr Ala Val Val Thr Gln Glu Pro Ile Tyr Ile
50 55 60

Ile Thr Glu Tyr Met Glu Asn Gly Ser Leu Val Asp Phe Leu Lys Thr
65 70 75 80

Pro Ser Gly Ile Lys Leu Thr Ile Asn Lys Leu Leu Asp Met Ala Ala
85 90 95

Gln Ile Ala Glu Gly Met Ala Phe Ile Glu Glu Arg Asn Tyr Ile His
100 105 110

Arg Asp Leu Arg Ala Ala Asn Ile Leu Val Ser Asp Thr Leu Ser Cys
115 120 125

Lys Ile Ala Asp Phe Gly Leu Ala Arg Leu Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Glu Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ile Asn Tyr Gly Thr Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Thr Glu Ile Val Thr His Gly Arg Ile Pro Tyr Pro
180 185 190




Gly Met Thr Asn Pro Glu Val Ile Gln Asn Leu Glu Arg Gly Tyr Arg
195 200 205 

Met Val Arg Pro Asp 
210 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Arg Lys Asn Tyr Ile His Arg Asp Leu Arg Ala Ala Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Lys Gly Thr Leu Ala His Arg Asp Phe Ser Ala Glu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Thr Lys Val Ala Val Lys Thr Leu Lys Pro Gly 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Asp Lys Val Ala Ile Lys Thr Ile Arg Glu Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Asp Leu Asn Ala Val Ala Asn Lys Ile Ala Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Thr Ser Leu Arg Ala Pro Thr Met Pro Pro Pro Leu Pro Pro Val Pro
1 5 10 15

Pro Gln Pro Ala Arg Arg Gln Ser Arg Arg Leu Pro Ala Ser Pro Val
20 25 30

Ile Ser



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Ser Asp Ala Glu Gly Thr Ala Val Ala Pro Pro Thr Val Thr Pro Val
1 5 10 15

Pro Ser Leu Glu Ala Pro Ser Glu Gln Ala Pro Thr Glu Gln Arg Pro
20 25 30

Gly Val Gln Glu
35 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ser Asp Ala Glu Gly Thr Ala Val Ala Pro Pro Thr Ile Thr Pro Ile
1 5 10 15

Pro Ser Leu Glu Ala Pro Ser Glu Gln Ala Pro Thr Glu Gln Arg Pro
20 25 30

Gly Val Gln Glu
35 



(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Ser Asp Ala Glu Trp Thr Ala Phe Val Pro Pro Asn Val Ile Leu Ala
1 5 10 15

Pro Ser Leu Glu Ala Phe Phe Glu Gln Ala Leu Thr Glu Glu Thr Pro
20 25 30

Gly Val Gln Asp
35 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Leu Val Thr Glu Ser Ser Val Leu Ala Thr Leu Thr Val Val Pro Asp
1 5 10 15

Pro Ser Thr Glu Ala Ser Ser Glu Glu Ala Pro Thr Glu Gln Ser Pro
20 25 30

Gly Val Gln Asp
35 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Pro Val Met Glu Ser Thr Leu Leu Thr Thr Pro Thr Val Val Pro Val
1 5 10 15

Pro Ser Thr Glu Leu Pro Ser Glu Glu Ala Pro Thr Glu Asn Ser Thr
20 25 30

Gly Val Gln Asp
35 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Pro Val Thr Glu Ser Ser Val Leu Thr Thr Pro Thr Val Ala Pro Val
1 5 10 15

Pro Ser Thr Glu Ala Pro Ser Glu Gln Ala Pro Pro Glu Lys Ser Pro
20 25 30

Val Val Gln Asp
35 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Ser Glu Thr Glu Ser Gly Val Leu Glu Thr Pro Thr Val Val Pro Glu
1 5 10 15

Pro Ser Met Glu Ala His Ser Glu Ala Ala Pro Thr Glu Gln Thr Pro
20 25 30

Val Val Arg Gln
35 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Ser Asp Thr Glu Ser Gly Thr Val Val Ala Pro Pro Thr Val Ile Gln
1 5 10 15

Val Pro Ser Leu Gly Pro Pro Ser Glu Gln Asp
20 25 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Pro Lys Asp Ala Thr Arg Phe Lys His Leu Arg Lys Tyr Thr Tyr Asn
1 5 10 15

Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly Thr Ala Asp Ser Arg
20 25 30

Ser Ala Thr Arg Ile
35 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 




(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Pro Lys Asp Ala Ser Gln Arg Arg Arg Ser Leu Glu Pro Ala Glu Asn
1 5 10 15

Val His Gly Ala Gly Gly Gly Ala Phe Pro Ala Ser Gln Thr Pro Ser
20 25 30 

Lys Pro 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Asp Lys Glu Ala Thr Lys Leu Thr Glu Glu Arg Asp Gly Ser Leu Asn
1 5 10 15

Gln Ser Ser Gly Tyr Arg Tyr Gly Thr Asp Pro Thr Pro Gln His Tyr
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Ile Gln Asn Tyr His Thr Phe Leu Ile Tyr Ile Thr Glu Leu Leu Lys
1 5 10 15

Lys Leu Gln Ser Thr Thr Val Met Asn Pro Tyr Met Lys Leu Ala Pro
20 25 30

Gly Glu Leu Thr Ile Ile Leu
35 



(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Pro Glu Glu Arg Pro Thr Phe Glu Tyr Leu Gln Ala Phe Leu Glu Asp
1 5 10 15

Tyr Phe Thr Ser Thr Glu Pro Gln Tyr Gln Pro Gly Glu Asn Leu
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Pro Glu Glu Arg Pro Thr Phe Glu Tyr Leu Gln Ser Phe Leu Glu Asp
1 5 10 15

Tyr Phe Thr Ala Thr Glu Pro Gln Tyr Gln Pro Gly Glu Asn Leu
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Pro Glu Glu Arg Pro Thr Phe Glu Tyr Ile Gln Ser Val Leu Asp Asp
1 5 10 15

Phe Tyr Thr Ala Thr Glu Ser Gln Tyr Gln Gln Gln Pro
20 25 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Ala Glu Glu Arg Pro Thr Phe Asp Tyr Leu Gln Ser Val Leu Asp Asp
1 5 10 15

Phe Tyr Thr Ala Thr Glu Gly Gln Tyr Gln Gln Gln Pro
20 25 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Pro Glu Asp Arg Pro Thr Phe Asp Tyr Leu Arg Ser Val Leu Glu Asp
1 5 10 15

Phe Phe Thr Ala Thr Glu Gly Gln Tyr Gln Pro Gln Pro
20 25 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Pro Xaa Xaa Xaa Xaa Pro 
1 5 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Pro Asp Phe Arg Leu Pro Glu Ile Ala Ile Pro Glu Phe Ile Ile Pro
1 5 10 15

Thr Leu Asn Leu Asn Asp Phe Gln Val Pro Asp Leu His Ile Pro Glu
20 25 30

Phe Gln Leu Pro His Ile Ser His
35 40 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Pro Gln Asn Ala Lys Leu Lys Ile Lys Arg Pro Val Lys Val Gln Pro
1 5 10 15

Ile Ala Arg Val Trp Tyr
20 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Pro Asp Phe Arg Leu Pro Glu Ile Ala Ile Pro Glu Phe Ile Ile Pro
1 5 10 15

Thr Leu Asn Leu Asn Asp 
20 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Asn Asp Phe Gln Val Pro Asp Leu His Ile Pro Glu Phe Gln Leu Pro
1 5 10 15

His Ile Ser His Thr Ile
20



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Pro Ser Leu Glu Leu Pro Val Leu His Val Pro Arg Asn Leu Lys Leu
1 5 10 15 

Ser Leu Pro His Phe Lys 
20 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 379 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Met Ala Ser Gly Arg Ala Arg Cys Thr Arg Lys Leu Arg Asn Trp Val
1 5 10 15

Val Glu Gln Val Glu Ser Gly Gln Phe Pro Gly Val Cys Trp Asp Asp
20 25 30

Thr Ala Lys Thr Met Phe Arg Ile Pro Trp Lys His Ala Gly Lys Gln
35 40 45

Asp Phe Arg Glu Ser Gln Asp Ala Ala Phe Phe Lys Ala Trp Ala Ile
50 55 60

Phe Lys Gly Lys Tyr Lys Glu Gly Asp Lys Glu Val Pro Glu Arg Gly
65 70 75 80

Arg Met Asp Val Ala Glu Pro Tyr Lys Val Tyr Gln Leu Leu Pro Pro
85 90 95

Gly Ile Val Ser Gly Gln Pro Gly Thr Gln Lys Val Pro Ser Lys Arg
100 105 110

Gln His Ser Ser Val Ser Ser Glu Arg Lys Glu Glu Asp Ala Met Gln
115 120 125

Asn Cys Thr Leu Ser Pro Ser Val Leu Gln Asp Ser Leu Asn Asn Glu
130 135 140

Glu Gly Ala Ser Gly Gly Ala Val His Ser Asp He Gly Ser Ser Ser 
145 150 155 160 
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Ser Ser Ser Ser Pro Glu Pro Gin Glu Val Thr Asp Thr Thr Glu Ala 
165 170 175 

Pro Phe Gin Gly Asp Gin Arg Ser Leu Glu Phe Leu Leu Pro Pro Glu 
180 185 190 

Pro Asp Tyr Ser .Leu Leu Leu Thr Phe lie Tyr Asn Gly Arg Val Val 
195 200 205 

Gly Glu Ala Gin Val Gin Ser Leu Asp Cys Arg Leu Val Ala Glu Pro 
210 215 220 

Ser Gly Ser Glu Ser Ser Met Glu Gin Val Leu Phe Pro Lys Pro Gly 
225 230 235 240 

Pro Glu Pro Thr Gin Arg Leu Leu Ser Gin Leu Glu Arg Gly lie Leu 
245 250 255 

Val Ala Ser Asn Pro Arg Gly Leu Phe Val Gin Arg Leu Cys Pro He 
260 265 270 

Pro He Ser Trp Asn Ala Pro Gin Ala Pro Pro Gly Pro Gly Pro His 
275 280 285 

Leu Leu Pro Ser Asn Glu Cys Val Glu Leu Phe Arg Thr Ala Tyr Phe 
290 295 300 

Cys Arg Asp Leu Val Arg Tyr Phe Gin Gly Leu Gly Pro Pro Pro Lys 
305 310 315 320 

Phe Gin Val Thr Leu Asn Phe Trp Glu Glu Ser His Gly Ser Ser His 
325 330 335 

Thr Pro Gin Asn Leu He Thr Val Lys Met Glu Gin Ala Phe Ala Arg 
340 345 350 

Tyr Leu Lys Met Glu Gin Ala Phe Ala Arg Tyr Leu Leu Glu Gin Thr 
355 360 365 

Pro Glu Gin Gin Ala Ala He Leu Ser Leu Val 
370 375 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 383 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Val Ser Leu Val Cys Pro Lys Asp Ala Thr Arg Phe Lys His Leu Arg 
1 5 10 15 

Lys Tyr Thr Tyr Asn Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly 
20 25 30 

Thr Ala Asp Ser Arg Ser Ala Thr Arg Ile Asn Cys Lys Val Glu Leu 
35 40 45 

Glu Val Pro Gln Leu Cys Ser Phe Ile Leu Lys Thr Ser Gln Cys Thr 
50 55 60 

Leu Lys Glu Val Tyr Gly Phe Asn Pro Glu Gly Lys Ala Leu Leu Lys 
65 70 75 80 

Lys Thr Lys Asn Ser Glu Glu Phe Ala Ala Ala Met Ser Arg Tyr Glu 
85 90 95 

Leu Lys Leu Ala Ile Pro Glu Gly Lys Gln Val Phe Leu Tyr Pro Glu 
100 105 110 

Lys Asp Glu Pro Thr Tyr Ile Leu Asn Ile Lys Arg Gly Ile Ile Ser 
115 120 125 

Ala Leu Leu Val Pro Pro Glu Thr Glu Glu Ala Lys Gln Val Leu Phe 
130 135 140 

Leu Asp Thr Val Tyr Gly Asn Cys Ser Thr His Phe Thr Val Lys Thr 
145 150 155 160 

Arg Lys Gly Asn Val Ala Thr Glu Ile Ser Thr Glu Arg Asp Leu Gly 
165 170 175 

Gln Cys Asp Arg Phe Lys Pro Ile Arg Thr Gly Ile Ser Pro Leu Ala 
180 185 190 

Leu Ile Lys Gly Met Thr Arg Pro Leu Ser Thr Leu Ile Ser Ser Ser 
195 200 205 

Gln Ser Cys Gln Tyr Thr Leu Asp Ala Lys Arg Lys His Val Ala Glu 
210 215 220 

Ala Ile Cys Lys Glu Gln His Leu Phe Leu Pro Phe Ser Tyr Lys Asn 
225 230 235 240 

Lys Tyr Gly Met Val Ala Gln Val Thr Gln Thr Leu Lys Leu Glu Asp 
245 250 255 

Thr Pro Lys Ile Asn Ser Arg Phe Phe Gly Glu Gly Thr Lys Lys Met 
260 265 270 

Gly Leu Ala Phe Glu Ser Thr Lys Ser Thr Ser Pro Pro Lys Gln Ala 
275 280 285 

Glu Ala Val Leu Lys Thr Leu Gln Glu Leu Lys Lys Leu Thr Ile Ser 
290 295 300 

Glu Gln Asn Ile Gln Arg Ala Asn Leu Phe Asn Lys Leu Val Thr Glu 
305 310 315 320 

Leu Arg Gly Leu Ser Asp Glu Ala Val Thr Ser Leu Leu Pro Gln Leu 
325 330 335 

Ile Glu Val Ser Ser Pro Ile Thr Leu Gln Ala Leu Val Gln Cys Gly 
340 345 350 

Gln Pro Gln Cys Ser Thr His Ile Leu Lys Arg Val His Ala Asn Pro 
355 360 365 

Leu Leu Ile Asp Val Val Thr Tyr Leu Val Ala Leu Ile Pro Glu 
370 375 380 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 394 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Phe Gly Leu Ser Asn Lys Ile Asn Ser Lys His Leu Arg Val Asn Gln 
1 5 10 15 

Asn Leu Val Tyr Glu Ser Gly Ser Leu Asn Phe Ser Lys Leu Glu Ile 
20 25 30 

Gln Ser Gln Val Asp Ser Gln His Val Gly His Ser Val Leu Thr Ala 
35 40 45 

Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala Glu Phe Thr Gly Arg 
50 55 60 

His Asp Ala His Leu Asn Gly Lys Val Ile Gly Thr Leu Lys Asn Ser 
65 70 75 80 

Leu Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr Ala Ser Thr Asn Asn 
85 90 95 

Glu Gly Asn Leu Lys Val Arg Phe Pro Leu Arg Leu Thr Gly Lys Ile 
100 105 110 

Asp Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser Pro Ser Ala Gln Gln 
115 120 125 

Ala Ser Trp Gln Val Ser Ala Arg Phe Asn Gln Tyr Lys Tyr Asn Gln 
130 135 140 

Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His Val Gly 
145 150 155 160 

Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile Pro Leu Thr Ile 
165 170 175 

Pro Glu Met Arg Leu Pro Tyr Thr Ile Ile Thr Thr Pro Pro Leu Lys 
180 185 190 

Asp Phe Ser Leu Trp Glu Lys Thr Gly Leu Lys Glu Phe Leu Lys Thr 
195 200 205 

Thr Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr Lys Lys Asn 
210 215 220 

Lys His Arg His Ser Ile Asn Pro Leu Ala Val Leu Cys Glu Phe Ile 
225 230 235 240 

Ser Gln Ser Ile Lys Ser Phe Asp Arg His Phe Glu Lys Asn Arg Asn 
245 250 255 

Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr Asn Glu Thr Lys Ile Lys 
260 265 270 

Phe Asp Lys Tyr Lys Ala Glu Lys Ser His Asp Glu Leu Pro Arg Thr 
275 280 285 

Phe Gln Ile Pro Gly Tyr Thr Val Pro Val Val Asn Val Glu Val Ser 
290 295 300 

Pro Phe Thr Ile Glu Met Ser Ala Phe Gly Tyr Val Phe Pro Lys Ala 
305 310 315 320 

Val Ser Met Pro Ser Phe Ser Ile Leu Gly Ser Asp Val Arg Val Pro 
325 330 335 

Ser Tyr Thr Leu Ile Leu Pro Ser Leu Glu Leu Pro Val Leu His Val 
340 345 350 

Pro Arg Asn Leu Lys Leu Ser Leu Pro His Phe Lys Glu Leu Cys Thr 
355 360 365 

Ile Ser His Ile Phe Ile Pro Ala Met Gly Asn Ile Thr Tyr Asp Phe 
370 375 380 

Ser Phe Lys Ser Ser Val Ile Thr Leu Asn 
385 390 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 



WO 98/56938 PCT/US98/11927 

Met Ala Ser Gly Arg Ala Arg Cys Thr Arg Lys Leu Arg Asn Trp Val 
1 5 10 15 

Val Glu Gln Val Glu Ser Gly Gln Phe Pro Gly Val Cys Trp Asp Asp 
20 25 30 

Thr Ala Lys Thr Met Phe Arg Ile Pro Trp Lys His Ala Gly Lys Gln 
35 40 45 

Asp Phe Arg 
50 



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Pro Lys Asp Ala Thr Arg Phe Lys His Leu Arg Lys Tyr Thr Tyr Asn 
1 5 10 15 

Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly Thr Ala Asp Ser Arg 
20 25 30 

Ser Ala Thr Arg Ile Asn Cys Lys Val Glu Leu Glu Val Leu Pro Gln 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Pro Glu Gly Lys Ala Leu Leu Lys Lys Thr Lys Asn Ser Glu Glu Phe 
1 5 10 15 

Ala Ala Ala Met Ser Arg Tyr Glu Leu Lys Leu Ala Ile Pro Glu Gly 
20 25 30 

Lys Gln Val Phe Leu 
35 



(2) INFORMATION FOR SEQ ID NO: 87: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Cys Ser Thr His Phe Thr Val Lys Thr Arg Lys Gly Asn Val Ala Thr 
1 5 10 15 

Glu Ile Ser Thr Glu Arg Asp Leu Gly Gln Cys Asp Arg Phe Lys Pro 
20 25 30 

Ile Arg Thr Gly Ile Ser 
35 



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Cys Ser Thr His Ile Leu Gln Trp Leu Lys Arg Val His Ala Asn Pro 
1 5 10 15 

Leu Leu Ile Asp Val Val Thr Tyr Leu Val Ala Leu Ile Pro Glu Pro 
20 25 30 

Ser Ala Gln Gln Leu Arg Glu Ile Phe Asn Met Ala Arg Asp Gln Arg 
35 40 45 

Ser Arg Ala 
50 



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

His Leu Ser Cys Asp Thr Lys Glu Glu Arg Lys Ile Lys Gly Val Ile 
1 5 10 15 

Ser Ile Pro Arg Leu Gln Ala Glu Ala Arg Ser Glu Ile Leu Ala His 
20 25 30 

Trp Ser Pro Ala Lys Leu 
35 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Ser Val His Leu Asp Ser Lys Lys Lys Gln His Leu Phe Val Lys Glu 
1 5 10 15 

Val Lys Ile Asp Gly Gln Phe Arg Val Ser Ser Phe Tyr Ala Lys Gly 
20 25 30 

Thr Tyr Gly Leu Ser Cys Gln Arg Asp Pro Asn Thr Gly Arg Leu 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala Ala Leu 
1 5 10 15 

Gly Lys Leu Pro Gln Gln Ala Asn Asp Tyr Leu Ser Phe Asn Trp Glu 
20 25 30 

Arg Gln Val Ser His Ala Lys Glu 
35 40 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Lys Leu Thr Ala Leu Thr Lys Lys Tyr Arg Ile Thr Glu Asn Asp Ile 
1 5 10 15 

Gln Ile Ala Leu Asp Asp Ala Lys Ile Asn Phe Asn Glu Lys Leu Ser 
20 25 30 

Gln Leu Gln Thr Tyr Met Ile Gln 
35 40 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Glu Arg Ile Asn Asp Val Leu Glu His Val Lys His Phe Val Ile Asn 
1 5 10 15 

Leu Ile Gly Asp Phe Glu Val Ala Glu Lys Ile Asn Ala Phe Arg Ala 
20 25 30 

Lys Val His Glu Leu Ile Glu Arg Tyr Glu Val Asp Gln Gln Ile Gln 
35 40 45 

Val Leu 
50 



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Asn Lys Phe Leu Asp Met Leu Ile Lys Lys Leu Lys Ser Phe Asp Tyr 
1 5 10 15 

His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile Arg Glu Val Thr Gln 
20 25 30 

Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu Pro Gln Lys Ala Glu 
35 40 45 

Ala Leu 
50 



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Ser Asn Lys Ile Asn Ser Lys His Leu Arg Val Asn Gln Asn Leu Val 
1 5 10 15 

Tyr Glu Ser Gly Ser Leu Asn 
20 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Phe Ser Lys Leu Glu Ile Gln Ser Gln Val Asp Ser Gln His Val Gly 
1 5 10 15 

His Ser Val Leu Thr Ala Lys Gly Met Ala Leu Phe Gly Glu Gly Gly 
20 25 30 

Lys Ala Glu Phe Thr Gly Arg His Asp Ala His Leu Asn Gly Lys 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Val Lys Ala Gln Tyr Lys Lys Asn Lys His Arg His Ser Ile Thr Asn 
1 5 10 15 

Pro Leu Ala Val Leu Cys Glu Phe Ile Ser Gln Ser Ile Lys Ser Phe 
20 25 30 

Asp Arg His Phe Glu Lys Asn Arg Asn Asn Ala Leu Asp Phe Val Thr 
35 40 45 

Lys Ser 
50 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys Arg Gly Leu Lys Leu 
1 5 10 15 

Ala Thr Ala Leu Ser Leu Ser Asn Lys Phe Val Glu Gly Ser His Asn 
20 25 30 

Ser Thr Val Ser Leu Thr Thr Lys Asn Met Glu Val Ser Val Ala Lys 
35 40 45 

Thr Thr Lys 

50 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg Gln His Leu Arg Val 
1 5 10 15 

Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn Gly Tyr Ser Phe Ser 
20 25 30 

Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile Thr Pro Gly Leu Lys 
35 40 45 

Leu Asn Asp 
50 



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Phe Arg Glu Ile Gln Ile Tyr Lys Lys Leu Arg Thr Ser Ser Phe Ala 
1 5 10 15 

Leu Asn Leu Pro Thr Leu Pro Glu Val Lys Phe Pro Glu Val Asp Val 
20 25 30 

Leu Thr Lys Tyr Ser Gln Pro Glu Asp Ser Leu Ile Pro Phe Phe Glu 
35 40 45 

Ile 



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Leu His Leu Arg Tyr Gln Lys Asp Lys Lys Gly Ile Ser Thr Ser Ala 
1 5 10 15 

Ala Ser Pro Ala Val Gly Thr Val Gly Met Asp Met Asp Glu Asp Asp 
20 25 30 

Asp Phe Ser Lys Trp Asn Phe Tyr Tyr Ser Pro Gln Ser Ser Pro Asp 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Leu Arg Glu Val Ser Ser Lys Leu Arg Arg Asn Leu Gln Asn Asn Ala 
1 5 10 15 

Glu Trp Val Tyr Gln Gly Ala Ile Arg Gln Ile Asp Asp Ile Asp Val 
20 25 30 

Arg Phe Gln Lys Ala Ala Ser Gly Thr Thr Gly Thr Tyr Gln Glu Trp 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

Arg Val Thr Gln Lys Phe His Met Lys Val Lys His Leu Ile Asp Ser 
1 5 10 15 

Leu Ile Asp Phe Leu Asn Phe Pro Arg Phe Gln Phe Pro Gly Lys Pro 
20 25 30 

Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met Phe Ile Arg Glu Val 
35 40 45 

Gly Thr 
50 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Trp Lys His Ala Gly Lys Gln Asp Phe Arg Glu Ser Gln Asp Ala Ala 
1 5 10 15 

Phe Phe Lys Ala Trp Ala Ile Phe Lys Gly Lys Tyr Lys Glu Gly Asp 
20 25 30 

Lys Glu Val Pro Glu Arg Gly Arg Met Asp Val Ala Glu Pro Tyr Lys 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

Glu His Val Lys His Phe Val Ile Asn Leu Ile Gly Asp Phe Glu Val 
1 5 10 15 

Ala Glu Lys Ile Asn Ala Phe Arg Ala Lys Val His Glu Leu Ile Glu 
20 25 30 

Arg Tyr Glu Val Asp Gln Gln Ile Gln Val Leu Met Asp Lys Leu Val 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Val Arg Lys Tyr Arg Ala Ala Leu Gly Lys Leu Pro Gln Gln Ala Asn 
1 5 10 15 

Asp Tyr Leu Asn Ser Phe Asn Trp Glu Arg Gln Val Ser His Ala Lys 
20 25 30 

Glu Lys Leu Thr Ala Leu Thr Lys Lys Tyr Arg Ile Thr Glu Asn Asp 
35 40 45 

Ile Gln Ile Ala 
50 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Tyr Ile Lys Asp Ser Tyr Asp Leu His Asp Leu Lys Ile Ala Ile Ala 
1 5 10 15 

Asn Ile Ile Asp Glu Ile Ile Glu Lys Leu Lys Ser Leu Asp Glu His 
20 25 30 

Tyr His Ile Arg Val Asn Leu Val Lys Thr Ile His Asp Leu His Leu 
35 40 45 

Phe Ile Glu Asn Ile Asp Phe Asn Lys 
50 55 



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

Lys Ile Thr Leu Ile Ile Asn Trp Leu Gln Glu Ala Leu Ser Ser Ala 
1 5 10 15 

Ser Leu Ala His Met Lys Ala Lys Phe Arg Glu Thr Leu Glu Asp Thr 
20 25 30 

Arg 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Thr Asp His Phe Ser Leu Arg Ala Arg Tyr His Met Lys Ala Asp Ser 
1 5 10 15 

Val Val Asp Leu Ser Tyr Asn Val Gln Gly Ser Gly Glu Thr Thr Tyr 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Lys Leu Thr Thr Asn Gly Arg Phe Arg Glu His Asn Ala Lys Phe Ser 
1 5 10 15 

Leu Asp Gly Lys 
20 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Asp Thr Lys Tyr Gln Ile Arg Ile Gln Ile Gln Glu Lys Leu Gln Gln 
1 5 10 15 

Leu Lys Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly Lys 
20 25 30 

Leu Lys Gln His Ile Glu Ala Ile Asp Val Arg Val Leu Leu Asp Gln 
35 40 45 

Leu Gly Thr Thr 
50 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Phe His Asp Phe Pro Asp Leu Gly Gln Glu Val Ala Leu Asn Ala Asn 
1 5 10 15 

Thr Lys Asn Gln Lys Ile Arg Trp Lys Asn Glu Val Arg Ile His Ser 
20 25 30 

Gly Ser Phe Gln Ser Gln Val Glu Leu Ser Asn Asp Gln 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His 
1 5 10 15 

Met Lys Val Lys His Leu Ile Asp Ser Leu Ile Asp Phe Leu Asn Phe 
20 25 30 



Pro Arg 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu Thr Asp Pro Asp Gly 
1 5 10 15 

Lys Gly Lys Glu Lys Ile Ala Glu Leu Ser Ala Thr Ala Gln Glu Ile 
20 25 30 

Ile Lys Ser 
35 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 211 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Glu Phe Thr Ile Val Ala Phe Val Lys Tyr Asp Lys Asn Gln Asp Val 
1 5 10 15 

His Ser Ile Asn Leu Pro Phe Phe Glu Thr Leu Gln Glu Tyr Phe Glu 
20 25 30 

Arg Asn Arg Gln Thr Ile Ile Val Val Leu Glu Asn Val Gln Arg Lys 
35 40 45 

Leu Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala Ala 
50 55 60 

Leu Gly Lys Leu Pro Gln Gln Ala Asn Asp Tyr Leu Asn Ser Phe Asn 
65 70 75 80 

Trp Glu Arg Gln Val Ser His Ala Lys Glu Lys Leu Thr Ala Leu Thr 
85 90 95 

Lys Lys Tyr Arg Ile Thr Glu Asn Asp Ile Gln Ile Ala Leu Asp Asp 
100 105 110 

Ala Lys Ile Asn Phe Asn Glu Lys Leu Ser Gln Leu Gln Thr Tyr Met 
115 120 125 

Ile Gln Phe Asp Gln Tyr Ile Lys Asp Ser Tyr Asp Leu His Asp Leu 
130 135 140 

Lys Ile Ala Ile Ala Asn Ile Ile Asp Glu Ile Ile Glu Lys Leu Lys 
145 150 155 160 

Ser Leu Asp Glu His Tyr His Ile Arg Val Ile Leu Val Lys Thr Ile 
165 170 175 

His Asp Leu His Leu Phe Ile Glu Asn Ile Asp Phe Asn Lys Ser Gly 
180 185 190 

Ser Ser Thr Ala Ser Trp Ile Gln Asn Val Asp Thr Lys Tyr Gln Ile 
195 200 205 

Arg Ile Gln 
210 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

Gly Pro Leu Pro Thr Leu Val Ser Gly Gly Thr Ile Leu Ala Thr Val 
1 5 10 15 

Pro Leu Val Val Asp Ala Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala 
20 25 30 

Gly Ser Lys Ala Pro Ala Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr 
35 40 45 

Ala His Asn Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys 
50 55 60 

Ile Ile Glu Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn 
65 70 75 80 

Lys Ser Ala Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln 
85 90 95 

His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala 
100 105 110 

Val His Lys Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys Gly Ser 
115 120 125 

Gly Gly Asn Thr Asp Val Leu Met Glu Gly Val Lys Thr Glu Val Glu 
130 135 140 

Asp Thr Leu Thr Pro Pro Pro Ser Asp Ala Gly Ser Pro Phe Gln Ser 
145 150 155 160 

Ser Pro Leu Ser Leu Gly Ser Arg Gly Ser Gly Ser Gly Gly 
165 170 



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 172 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Gln Val Pro Thr Leu Val Gly Ser Ser Gly Thr Ile Leu Thr Thr Met 
1 5 10 15 

Pro Val Met Met Gly Gln Glu Lys Val Pro Ile Lys Gln Val Pro Gly 
20 25 30 

Gly Val Lys Gln Leu Glu Pro Pro Lys Glu Gly Glu Arg Arg Thr Thr 
35 40 45 

His Asn Ile Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile 
50 55 60 

Ile Glu Leu Lys Asp Leu Val Met Gly Thr Asp Ala Lys Met His Lys 
65 70 75 80 

Ser Gly Val Leu Arg Lys Ala Ile Asp Tyr Ile Lys Tyr Leu Gln Gln 
85 90 95 

Val Asn His Lys Leu Arg Gln Glu Asn Met Val Leu Lys Leu Ala Asn 
100 105 110 

Gln Lys Asn Lys Leu Leu Lys Gly Ile Asp Leu Gly Ser Leu Val Asp 
115 120 125 

Asn Glu Val Asp Leu Lys Ile Glu Asp Phe Asn Gln Asn Val Leu Leu 
130 135 140 

Met Ser Pro Pro Ala Ser Asp Ser Gly Ser Gln Ala Gly Phe Ser Pro 
145 150 155 160 

Tyr Ser Ile Asp Ser Glu Pro Gly Ser Pro Leu Leu 
165 170 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Gly Pro Leu Gln Thr Leu Val Ser Gly Gly Thr Ile Leu Ala Thr Val 
1 5 10 15 

Pro Leu Val Val Asp Thr Asp Lys Leu Pro Ile His Arg Leu Ala Ala 
20 25 30 

Gly Gly Lys Ala Leu Gly Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr 
35 40 45 

Ala His Asn Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys 
50 55 60 

Ile Val Glu Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn 
65 70 75 80 

Lys Ser Ala Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln 
85 90 95 

His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu Thr Leu Arg Ser Ala 
100 105 110 

His Lys Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys Gly Ser Gly 
115 120 125 

Gly Gly Thr Asp Val Ser Met Glu Gly Met Lys Pro Glu Val Val Glu 
130 135 140 

Thr Leu Thr Pro Pro Pro Ser Asp Ala Gly Ser Pro Ser Gln Ser Ser 
145 150 155 160 

Pro Leu Ser Leu Gly Ser Arg Gly Ser Ser Ser Gly Gly 
165 170 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 






Asp Glu Pro Pro Gln Ser Pro Trp Asp Arg Val Lys Asp Leu Ala Thr 
1 5 10 15 

Val Tyr Val Asp Val Leu Lys Asp Ser Gly Arg Asp Tyr Val Ser Gln 
20 25 30 

Phe Glu Gly Ser Ala Leu Gly Lys Gln Leu Asn Leu Lys Leu Leu Asp 
35 40 45 

Asn Trp Asp Ser Val Thr Ser Thr Phe Ser Lys Leu Arg Glu Gln Leu 
50 55 60 

Gly Pro Val Thr Gln Glu Phe Trp Asp Asn Leu Glu Lys Glu Thr Glu 
65 70 75 80 

Gly Leu Arg Gln Glu Met Ser Lys Asp Leu Glu Glu Val Lys Ala Lys 
85 90 95 

Val Gln Pro Tyr Leu Asp Asp Phe Gln Lys Lys Trp Gln Glu Glu Met 
100 105 110 

Glu Leu Tyr Arg Gln Lys Val Glu Pro Leu Arg Ala Glu Leu Gln Glu 
115 120 125 

Gly Ala Arg Gln Lys Leu His Glu Leu Gln Glu Lys Leu Ser Pro Leu 
130 135 140 

Gly Glu Glu Met Arg Asp Arg Ala Arg Ala His Val Asp Ala Leu Arg 
145 150 155 160 

Thr His Leu Ala Pro Tyr Ser Asp Glu Leu Arg Gln Arg Leu Ala Ala 
165 170 175 

Arg Leu Glu Ala Leu Lys Glu Asn Gly Gly Ala Arg Leu Ala Glu Tyr 
180 185 190 

His Ala Lys Ala Thr Glu His Leu Ser Thr Leu Ser Glu Lys Ala Lys 
195 200 205 

Pro Ala Leu Glu Asp Leu Arg Gln Gly Leu Leu Pro Val Leu Glu Ser 
210 215 220 

Phe Lys Val Ser Phe Leu Ser Ala Leu Glu Glu Tyr Thr Lys Lys Leu 
225 230 235 240 

Asn Thr Gln 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 268 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Gln Gln Val Pro Val Leu Leu Gln Pro His Phe Ile Lys Ala Asp Ser 
1 5 10 15 

Leu Leu Leu Thr Ala Met Lys Thr Asp Gly Ala Thr Val Lys Ala Ala 
20 25 30 

Gly Leu Ser Pro Leu Val Ser Gly Thr Thr Val Gln Thr Gly Pro Leu 
35 40 45 

Pro Thr Leu Val Ser Gly Gly Thr Ile Leu Ala Thr Val Pro Leu Val 
50 55 60 

Val Asp Ala Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys 
65 70 75 80 

Ala Pro Ala Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn 
85 90 95 

Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu 
100 105 110 

Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Ala 
115 120 125 

Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser Asn 
130 135 140 

Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala Val His Lys 
145 150 155 160 

Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys Gly Ser Gly Gly Asn 
165 170 175 

Thr Asp Val Leu Met Glu Gly Val Lys Thr Glu Val Glu Asp Thr Leu 
180 185 190 

Thr Pro Pro Pro Ser Asp Ala Gly Ser Pro Phe Gln Ser Ser Pro Leu 
195 200 205 

Ser Leu Gly Ser Arg Gly Ser Gly Ser Gly Gly Ser Gly Ser Asp Ser 
210 215 220 

Glu Pro Asp Ser Pro Val Phe Glu Asp Ser Lys Ala Lys Pro Glu Gln 
225 230 235 240 

Arg Pro Ser Leu His Ser Arg Gly Met Leu Asp Arg Ser Arg Leu Ala 
245 250 255 

Leu Cys Thr Leu Val Phe Leu Cys Leu Ser Cys Asn 
260 265 
(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Gln Ala Lys Glu Pro Cys Val Glu Ser Leu Val Ser Gln Tyr Phe Gln 
1 5 10 15 

Thr Val Thr Asp Tyr Gly Lys Asp Leu Met Glu Lys Val Lys Ser Pro 
20 25 30 

Glu Leu Gln Ala Glu Ala Lys Ser Tyr Phe Glu Lys Ser Lys Glu Gln 
35 40 45 

Leu Thr Pro Leu Ile Lys Lys Ala Gly Thr Glu Leu Val Asn Phe Leu 
50 55 60 

Ser Tyr Phe Val Glu Leu Gly Thr Gln Pro Ala Thr Gln 
65 70 75 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Glu Ala Lys Leu Asn Lys Ser Ala Val Leu Arg Lys Ala Ile Asp Tyr 
1 5 10 15 

Ile Arg Phe Leu Gln His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu 
20 25 30 

Ser Leu Arg Thr Ala Val His Lys Ser Lys Ser Leu Lys Asp Leu Val 
35 40 45 

Ser Ala Cys Gly Ser Gly Gly Asn Thr Asp Val Leu Met Glu Gly Val 
50 55 60 

Lys Thr Glu Val Glu Asp Thr 
65 70 



(2) INFORMATION FOR SEQ ID NO: 123: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 397 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Gln Lys Ser Glu Leu Thr Gln Gln Leu Asn Ala Leu Phe Gln Asp Lys 
1 5 10 15 

Leu Gly Glu Val Asn Thr Tyr Ala Gly Asp Leu Gln Lys Lys Leu Val 
20 25 30 

Pro Phe Ala Thr Glu Leu His Glu Arg Leu Ala Lys Asp Ser Glu Lys 
35 40 45 

Leu Lys Glu Glu Ile Gly Lys Glu Leu Glu Glu Leu Arg Ala Arg Leu 
50 55 60 

Leu Pro His Ala Asn Glu Val Ser Gln Lys Ile Gly Asp Asn Leu Arg 
65 70 75 80 

Glu Leu Gln Gln Arg Leu Glu Pro Tyr Ala Asp Gln Leu Arg Thr Gln 
85 90 95 

Val Asn Thr Gln Ala Glu Gln Leu Arg Arg Gln Leu Asp Pro Leu Ala 
100 105 110 

Gln Arg Met Glu Arg Val Leu Arg Glu Asn Ala Asp Ser Leu Gln Ala 
115 120 125 

Ser Leu Arg Pro His Ala Asp Glu Leu Lys Ala Lys Ile Asp Gln Asn 
130 135 140 

Val Glu Glu Leu Lys Gly Arg Leu Thr Pro Tyr Ala Asp Glu Phe Lys 
145 150 155 160 

Val Lys Ile Asp Gln Thr Val Glu Glu Leu Arg Arg Ser Leu Ala Pro 
165 170 175 

Tyr Ala Gln Asp Thr Gln Glu Lys Leu Asn His Gln Leu Glu Gly Leu 
180 185 190 

Thr Phe Gln Met Lys Lys Asn Ala Glu Glu Leu Lys Ala Arg Ile Ser 
195 200 205 

Ala Ser Ala Glu Ile Asp Gln Thr Val Glu Glu Leu Arg Arg Ser Leu 
210 215 220 

Ala Pro Tyr Ala Gln Asp Thr Gln Glu Lys Leu Asn His Gln Leu Glu 
225 230 235 240 

Gly Leu Thr Phe Gln Met Lys Lys Asn Ala Glu Glu Leu Lys Ala Arg 
245 250 255 

Ile Ser Ala Ser Ala Glu Glu Leu Arg Gln Arg Leu Ala Pro Leu Ala 
260 265 270 

Glu Asp Val Arg Gly Asn Leu Lys Gly Asn Thr Glu Gly Leu Gln Lys 
275 280 285 

Ser Leu Ala Glu Leu Gly Gly His Leu Asp Gln Gln Val Glu Glu Phe 
290 295 300 

Arg Arg Arg Val Glu Pro Tyr Gly Glu Asn Phe Asn Lys Ala Leu Val 
305 310 315 320 

Gln Gln Met Glu Gln Leu Arg Gln Lys Leu Gly Pro His Ala Gly Asp 
325 330 335 

Val Glu Gly His Leu Ser Phe Leu Glu Lys Asp Leu Arg Asp Lys Val 
340 345 350 

Asn Ser Phe Phe Ser Thr Phe Lys Glu Lys Glu Ser Gln Asp Lys Thr 
355 360 365 

Leu Ser Leu Pro Glu Leu Glu Gln Gln Gln Glu Gln Gln Gln Glu Gln 
370 375 380 

Gln Gln Glu Gln Val Gln Met Leu Ala Pro Leu Glu Ser 
385 390 395 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 422 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys Ala Pro Ala 
1 5 10 15 

Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn Ala Ile Glu 
20 25 30 

Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu Leu Lys Asp 
35 40 45 

Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Ala Val Leu Arg 
50 55 60 

Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser Asn Gln Lys Leu 
65 70 75 80 

Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala Val His Lys Ser Lys Ser 
85 90 95 

Leu Lys Asp Leu Val Ser Ala Cys Gly Ser Gly Gly Asn Thr Asp Val 
100 105 110 

Leu Met Glu Gly Val Lys Thr Glu Val Glu Asp Thr Leu Thr Pro Pro 
115 120 125 

Pro Arg Asp Ala Gly Ser Pro Phe Gln Ser Ser Pro Leu Ser Leu Gly 
130 135 140 

Ser Arg Gly Ser Gly Ser Gly Gly Ser Gly Ser Asp Ser Glu Pro Asp 
145 150 155 160 

Ser Pro Val Phe Glu Asp Ser Lys Ala Lys Pro Glu Gln Arg Pro Ser 
165 170 175 

Leu His Ser Arg Gly Met Leu Asp Arg Ser Arg Leu Ala Leu Cys Thr 
180 185 190 

Leu Val Phe Leu Cys Leu Ser Cys Asn Pro Leu Ala Ser Leu Leu Gly 
195 200 205 

Ala Arg Gly Leu Pro Ser Pro Ser Asp Thr Thr Ser Val Tyr His Ser 
210 215 220 

Pro Gly Arg Asn Val Leu Gly Thr Glu Ser Arg Asp Gly Pro Gly Trp 
225 230 235 240 

Ala Gln Ala Val Gln Leu Phe Leu Cys Asp Leu Leu Leu Val Val Arg 
245 250 255 

Thr Ser Leu Trp Arg Gln Gln Gln Pro Pro Ala Pro Ala Pro Ala Ala 
260 265 270 

Gln Gly Ala Ser Ser Arg Pro Gln Ala Ser Ala Leu Glu Ile Arg Gly 
275 280 285 

Phe Gln Arg Asp Leu Ser Ser Leu Arg Arg Leu Ala Gln Ser Phe Arg 
290 295 300 

Pro Ala Met Arg Arg Val Phe Leu His Glu Ala Thr Ala Arg Leu Met 
305 310 315 320 

Ala Gly Ala Ser Pro Thr Arg Thr His Gln Leu Leu Asp Arg Ser Leu 
325 330 335 

Arg Arg Arg Ala Gly Pro Gly Gly Lys Gly Gly Ala Val Ala Glu Leu 
340 345 350 

Glu Pro Arg Pro Thr Arg Arg Glu His Ala Glu Ala Leu Leu Leu Ala 
355 360 365 

Ser Cys Tyr Leu Pro Pro Gly Phe Leu Ser Ala Pro Gly Gln Arg Val 
370 375 380 

Gly Met Leu Ala Glu Ala Ala Arg Thr Leu Glu Lys Leu Gly Asp Arg 
385 390 395 400 

Arg Leu Leu His Asp Cys Gln Gln Met Leu Met Arg Leu Gly Gly Gly 
405 410 415 

Thr Thr Val Thr Ser Ser 
420 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Glu Lys Met Ser Leu Arg Asn Arg Leu Ser Lys Ser Arg Glu Asn Pro 
1 5 10 15 

Glu Glu Asp Glu Asp Gln Arg Asn Pro Ala Lys Glu Ser Leu Glu Thr 
20 25 30 

Pro Ser Asn Gly Arg Ile Asp Ile Lys Gln Leu Ile Ala Lys Lys Ile 
35 40 45 

Lys Leu Thr Ala Asn Gly Arg Ile Asp Ile Lys Gln Leu Ile Ala Lys 
50 55 60 

Lys Ile Lys Leu Thr Ala Glu Asn Gly Arg Ile Asp Ile Lys Gln Leu 
65 70 75 80 

Ile Ala Lys Lys Ile Lys Leu Thr Ala Glu Ala Glu Glu Leu Lys Pro 
85 90 95 

Phe Phe Met Lys Glu Val Gly Ser His Phe Asp Asp Phe Val Thr Asn 
100 105 110 

Leu Ile Glu Lys Ser Ala Ser Leu Asp Asn Lys Ala His Ser Phe Val 
115 120 125 

Arg Glu Asn Val Pro Arg Val Leu Asn Ser Ala Lys Glu Lys 
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys Ala Pro Ala 
1 5 10 15 

Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn Ala Ile Glu 
20 25 30 

Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu Leu Lys Asp 
35 40 45 

Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Tyr Ile Arg Phe 
50 55 60 

Leu Gln His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg 
65 70 75 80 

Thr Ala Val His Lys Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys 
85 90 95 

Gly Ser Gly Gly Asn Thr Asp Val Leu Met Glu Gly Val Lys Thr Glu 
100 105 110 

Val Glu Asp Lys Ala Lys Pro Glu Gln Arg Pro Ser Leu His Ser Arg 
115 120 125 

Gly Met Leu Asp Arg Ser Arg 
130 135 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Arg Arg His Cys Pro Leu Lys Asn Pro Thr Phe Leu Asp Tyr Val Arg 
1 5 10 15 

Pro Arg Ser Trp Thr Cys Arg Tyr Val Phe 
20 25 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Arg Arg Arg Ala Gly Pro Gly Gly Lys Gly Gly Ala Val Ala Glu Leu 
1 5 10 15 

Glu Pro Arg Pro Thr Arg Arg Glu His 
20 25 



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Ala Met Leu Gly Gln Ser Thr Glu Glu Leu Arg Val Arg Leu Ala Ser 
1 5 10 15 

His Leu Arg Lys Leu Arg Lys Arg Leu Leu Arg Asp Ala Asp Asp Leu 
20 25 30 

Gln Lys Arg Leu Ala Val Tyr Gln Ala Gly Ala Arg Glu Gly Ala Glu 
35 40 45 

Arg Gly Leu Ser Ala Ile Arg Glu Arg Leu Gly Pro Leu Val Glu Gln 
50 55 60 

Gly Arg Val Arg Ala Ala Thr Val Gly Ser Leu Ala Gly Gln Pro Leu 
65 70 75 80 

Gln Glu Arg Ala Gln Ala Trp Gly Glu Arg Leu Arg Ala Arg Met Glu 
85 90 95 

Glu Met Gly Ser Arg Thr Arg Asp Arg Leu Asp Glu Val Lys Glu Gln 
100 105 110 

Val Ala 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys Ala Pro Ala Ser 
1 5 10 15 

Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn Ala Ile Glu Lys 
20 25 30 

Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu Leu Lys Asp Leu 
35 40 45 

Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Ala Val Leu Arg Lys 
50 55 60 

Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser Asn Gln Lys Leu Lys 
65 70 75 80 

Gln Glu Asn Leu Ser Leu Arg Thr Ala Val His Lys Ser Lys Ser Leu 
85 90 95 

Lys Asp Leu Val Ser Ala Cys Gly Ser Gly Gly 
100 105 



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Thr Gln Gln Pro Gln Gln Asp Glu Met Pro Ser Pro Thr Phe Leu Thr 
1 5 10 15 

Gln Val Lys Glu Ser Leu Ser Ser Tyr Trp Glu Ser Ala Lys Thr Ala 
20 25 30 

Ala Gln Asn Leu Tyr Glu Lys Thr Tyr Leu 
35 40 



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Ser Gln Ile Gln Gln Val Pro Val Leu Leu Gln Pro His Phe Ile Lys 
1 5 10 15 

Ala Asp Ser Leu Leu Leu Thr Ala Met Lys Thr Asp Gly Ala Thr Val 
20 25 30 

Lys Ala Ala Gly Leu Ser Pro Leu Val Ser Gly Thr Thr 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Ser Leu Leu Ser Phe Met Gln Gly Tyr Met Lys His Ala Thr Lys Thr 
1 5 10 15 

Ala Lys Asp Ala Leu Ser Ser Val Gln Glu Ser Gln Val Ala Gln Gln 
20 25 30 

Ala Arg Gly Trp Val Thr Asp Gly Phe Ser Ser Leu Lys 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Ala Pro Ala Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn 
1 5 10 15 

Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu 
20 25 30 

Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

Asp Tyr Trp Ser Thr Val Lys Asp Lys Phe Ser Glu Phe Trp Asp Leu 
1 5 10 15 

Asp Pro Glu Val Arg Pro Thr Ser Ala Val Ala Ala 
20 25 



(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Glu Ile Tyr Val Ala Ala Ala Leu Arg Val Lys Thr Ser Leu Pro Arg 
1 5 10 15 

Ala Leu His Phe Leu Thr Arg Phe Phe Leu Ser Ser Ala Arg Gln Ala 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Glu Lys Ile Pro Thr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Glu Lys Leu Pro Ile 
1 5 



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Glu Asn Gly Arg Cys Ile Gln Ala Asn Tyr Ser Leu Met Glu Asn Gly 
1 5 10 15 

Lys Ile Lys Val Leu Asn Gln Glu Leu Arg Ala Asp Gly 
20 25 



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

Ala Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser 
1 5 10 15 

Asn Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala Val 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

Met Lys Gln Leu Glu Asp Lys Val Glu Glu Leu Leu Ser Lys Asn Tyr 
1 5 10 15 

His Leu Glu Asn Glu Val Ala Arg Leu Lys Lys Leu Val Gly Glu Arg 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Arg Ile Gln Ile Gln Glu Lys Leu Gln Gln Leu Lys Arg His Ile Gln 
1 5 10 15 

Asn Ile Asp Ile Gln His Leu Ala Gly Lys Leu Lys Gln His Ile Glu 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Val Leu Gln Gln Val Lys Ile Lys Asp Tyr Phe Glu Lys Leu Val Gly 
1 5 10 15 

Phe Ile Asp Asp Ala Val Lys Lys Leu Asn Glu Leu Ser Phe Lys Thr 
20 25 30 

Phe Ile Glu 
35 



(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Glu Leu Ser Phe Lys Thr Phe Ile Glu Asp Val Asn Lys Phe Leu Asp 
1 5 10 15 

Met Leu Ile Lys Lys Leu Lys Ser Phe Asp Tyr His Gln Phe Val 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile Arg Glu Val Thr Gln 
1 5 10 15 

Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu Pro 
20 25 



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gln Tyr Ser Ile Gln Asp 
1 5 10 15 

Trp Ala Lys Arg Met Lys Ala Leu Val Glu Gln Gly Phe Thr Val 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Ser Ala Ser Leu Ala His Met Lys Ala Lys Phe Arg Glu Thr Leu Glu 
1 5 10 15 

Asp Thr Arg Asp Arg Met Tyr Asp Met Asp Ile Gln Gln Glu Leu Gln 
20 25 30 

Arg Tyr Leu 
35 



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Cys Leu Asn Leu His Lys Phe Asn Glu Phe Ile Gln Asn Glu Leu Gln 
1 5 10 15 

Glu Ala Ser Gln Glu Leu Gln Gln Ile His Gln Tyr Ile Met Ala Leu 
20 25 30 

Arg Glu Glu 
35 



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

Phe Leu Ile Tyr Ile Thr Glu Leu Leu Lys Lys Leu Gln Ser Thr Thr 
1 5 10 15 

Val Met Asn Pro Tyr Met Lys Leu Ala Pro Gly Glu Leu Thr Ile Ile 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

Arg Leu Leu Asp His Arg Val Pro Glu Thr Asp Met Thr Phe Arg His 
1 5 10 15 

Val Gly Ser Lys Leu Ile Val Ala Met Ser Ser Trp Leu Gln 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Leu Asn Phe Ser Lys Leu Glu Ile Gln Ser Gln Val Asp Ser Gln His 
1 5 10 15 

Val Gly His Ser Val Leu Thr Ala Lys Gly Met Ala Leu Phe 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Asn Gln Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His 
1 5 10 15 

Val Gly Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Met Val Val Thr Arg Ile Ala Pro Ser Pro Thr Gly Asp Pro His Val 
1 5 10 15 

Gly Thr Ala Tyr Ile Ala Leu Phe Asn Tyr Ala Trp Ala 
20 25 



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

Thr Thr Val His Thr Arg Phe Pro Pro Glu Pro Asn Gly Tyr Leu His 
1 5 10 15 

Ile Gly His Ala Lys Ser Ile Cys Leu Asn Phe Gly Ile Ala 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

Lys Ile Lys Leu Tyr Cys Gly Val Asp Pro Thr Ala Gln Ser Leu His 
1 5 10 15 

Leu Gly Asn Leu Val Pro Met Val Leu Leu His Phe Tyr Val 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

Pro Ile Ala Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His 
1 5 10 15 

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Gly Gln 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

Arg Val Thr Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His 
1 5 10 15 

Ile Gly Asn Leu Ala Ala Ile Leu Thr Leu Arg Arg Phe Gln 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

Arg Ile Gly Ala Tyr Val Gly Ile Asp Pro Thr Ala Pro Ser Leu His 
1 5 10 15 

Val Gly His Leu Leu Pro Leu Met Pro Leu Phe Trp Met Tyr 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Pro Ile Ala Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His 
1 5 10 15 

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

Pro Leu Lys Val Lys Leu Gly Ala Asp Pro Thr Ala Pro Asp Ile His 
1 5 10 15 

Leu Gly His Thr Val Val Leu Asn Lys Leu Arg Gln Phe Gln 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

Val Ser Lys Gly Leu Leu Ile Phe Asp Ala Ser Ser Ser Met Gly Pro 
1 5 10 15 

Gln Met Ser Ala Ser Val His Leu Asp Ser Lys Lys Lys Gln His Leu 
20 25 30 

Phe Val Lys Glu Val Lys Ile Asp Gly Gln Phe 
35 40 



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

Thr Ile Ile Thr Thr Pro Pro Leu Lys Asp Phe Ser Leu Trp Glu Lys 
1 5 10 15 

Thr Gly Leu Lys Glu Phe Leu Lys Thr Thr Lys Gln Ser Phe Asp Leu 
20 25 30 

Ser Val Lys Ala Gln Tyr Lys Lys Asn Lys His 
35 40 



(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

Lys Asn Arg Asn Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr Asn Glu 
1 5 10 15 

Thr Lys Ile Lys Phe Asp Lys Tyr Lys Ala Glu Lys Ser Gln Asp Glu 
20 25 30 

Leu Pro Arg Thr Phe Gln Ile 
35 



(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

Asp Ala Leu Gln Tyr Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys 
1 5 10 15 

Arg Gly Leu Lys Leu Ala Thr Ala Leu Ser Leu Ser Asn Lys Phe Val 
20 25 30 

Glu Gly Ser His 
35 



(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

Arg Ala Phe Gly Trp Glu Ala Pro Arg Glu Tyr His Met Pro Leu Leu 
1 5 10 15 

Arg Asn Pro Asp Lys Thr Lys Ile Ser Lys Arg Lys Ser His Thr Ser 
20 25 30 

Leu Asp Trp Tyr Lys Ala Glu Gly Phe Leu 
35 40 



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

Asp Asn Ile Thr Ile Pro Val His Pro Arg Gln Tyr Glu Phe Ser Arg 
1 5 10 15 

Leu Asn Leu Glu Tyr Thr Val Met Ser Lys Arg Lys Leu Asn Leu Leu 
20 25 30 

Val Thr Asp Lys His Val Glu Gly Trp Asp 
35 40 



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

Lys Asn Lys Gly Leu Pro Phe Gly Ile Thr Val Pro Leu Leu Thr Thr 
1 5 10 15 

Ala Thr Gly Glu Lys Phe Gly Lys Ser Ala Gly Asn Ala Val Phe Ile 
20 25 30 

Asp Pro Ser Ile Asn Thr Ala Tyr 
35 40 



(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr Val Pro Leu Ile Thr 
1 5 10 15 

Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Gly Gly Ala Val Trp 
20 25 30 

Leu Asp Pro Lys Lys Thr Ser Pro Tyr 
35 40 



(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

Lys Thr Lys Gly Glu Ala Arg Ala Phe Gly Leu Thr Ile Pro Leu Val 
1 5 10 15 

Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Ser Gly Thr Ile 
20 25 30 

Trp Leu Asp Lys Glu Lys Thr Ser Pro Tyr 
35 40 



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Lys Thr Ala Leu Asp Glu Cys Val Gly Phe Thr Val Pro Leu Leu Thr 
1 5 10 15 

Asp Ser Ser Gly Ala Lys Phe Gly Lys Ser Ala Gly Asn Ala Ile Trp 
20 25 30 

Leu Asp Pro Tyr Gln Thr Ser Val Phe 
35 40 



(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr Val Pro Leu Ile Thr 
1 5 10 15 

Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Gly Gly Ala Val Trp 
20 25 30 

Leu Asp Pro Lys Lys Thr Ser Pro Tyr 
35 40 



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

Ser Ala Gly Lys Lys Pro Gln Val Ala Ile Thr Leu Pro Leu Leu Val 
1 5 10 15 

Gly Leu Asp Gly Glu Lys Lys Met Ser Lys Ser Leu Gly Asn Tyr Ile 
20 25 30 

Gly Val Thr Glu Ala Pro Ser Asp Met Phe 
35 40 



(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

Arg Val Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn Gly Tyr Ser 
1 5 10 15 

Phe Ser Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile Thr Pro Gly 
20 25 30 

Leu Lys Leu 
35 



(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

Lys Leu Gly Gln Gly Cys Phe Gly Glu Val Trp Met Gly Thr Trp Asn 
1 5 10 15 

Gly Thr Thr Arg Val Ala Ile Lys Thr Leu Lys Pro Gly 
20 25 



(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

His Ile Gly His 
1 



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

His Lys Asn Thr Ser Thr Leu Ser Cys Asp Gly Ser Leu Arg His Lys 
1 5 10 15 

Phe 



(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

Arg Lys Leu Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg 
1 5 10 15 

Ala 



(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 

Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly Lys Leu Lys 
1 5 10 15 

Gln His 



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

Lys Lys Gly Phe Tyr Lys Lys Lys Gln Cys Arg Pro Ser Lys Gly Arg 
1 5 10 15 

Lys 



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 

Lys Lys Pro Leu Asp Gly Glu Tyr Phe Thr Leu Gln Ile Arg Gly Arg 
1 5 10 15 

Glu Arg 



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

Lys Arg Ala Leu Pro Asn Asn Thr Ser Ser Ser Pro Gln Pro Lys Lys 
1 5 10 15 

Lys 



(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

Lys Lys Thr Asn Leu Phe Ser Ala Leu Ile Lys Lys Lys Lys Lys Thr 
1 5 10 15 

Ala 



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

Arg Lys Thr Leu Leu Asn Ser Leu Glu Glu Ala Lys Lys Lys Lys Glu 
1 5 10 15 

Asp 



(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Arg Arg Glu Leu Asp Glu Ser Leu Gln Val Ala Glu Arg Leu Thr Arg 
1 5 10 15 

Lys 



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Arg Arg Ser Tyr Ala Leu Val Ser Leu Ser Phe Phe Arg Lys Leu Arg 
1 5 10 15 

Leu 



(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Arg Arg Tyr Gly Asp Glu Glu Leu His Leu Cys Val Ser Arg Lys His 
1 5 10 15 

Phe 



(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

Lys Arg Val Ala Lys Arg Lys Leu Ile Glu Gln Asn Arg Glu Arg Arg 
1 5 10 15 

Arg 



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

His Arg Ser Thr Asn Ala Gln Gly Ser His Trp Lys Gln Arg Arg Lys 
1 5 10 15 

Phe 



(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

Lys Arg Pro Pro Ile Ser Asp Ser Glu Glu Leu Ser Ala Lys Lys Arg 
1 5 10 15 

Lys 



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

Lys Lys Gly Lys Lys Pro Lys Thr Glu Lys Glu Asp Lys Val Lys His 
1 5 10 15 

Ile 



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

Arg Lys Arg Met Arg Asn Arg Ile Ala Ala Ser Lys Cys Arg Lys Arg 
1 5 10 15 

Lys 



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly Lys Leu Lys 
1 5 10 15 

Gln His 



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

Lys Lys Ile Thr Glu Val Ala Leu Met Gly His Leu Ser Cys Asp Thr 
1 5 10 15 

Lys Glu Glu Arg Lys 
20 



(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu Thr Asp Pro Asp Gly 
1 5 10 15 

Lys Gly Lys Glu Lys 
20 



(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Lys Glu Val Tyr Gly Phe Asn Pro Glu Gly Lys Ala Leu Leu Lys Lys 
1 5 10 15 

Thr Lys 



(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

Lys Val Leu Val Asp His Phe Gly Tyr Thr Lys Asp Asp Lys His Glu 
1 5 10 15 

Asp Met 



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

Lys Ala Gly Lys Leu Lys Phe Ile Ile Pro Ser Pro Lys Arg Pro Val 
1 5 10 15 

Lys Leu 



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 

Arg Gln Val Ser His Ala Lys Glu Lys Leu Thr Ala Leu Thr Lys Lys 
1 5 10 15 

Tyr Arg 



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 

Lys Tyr Gln Ile Arg Ile Gln Ile Gln Glu Lys Leu Gln Gln Leu Lys 
1 5 10 15 

Arg His 



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 

Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala Glu Phe Thr Gly Arg 
1 5 10 15 

His Asp Ala His 
20 



(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 

Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr Lys Lys Asn Lys 
1 5 10 15 

His Arg 



(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 

Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys Arg Gly Leu Lys 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 

Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg Gln His Leu Arg
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

Lys Leu Asp Phe Arg Glu Ile Gln Ile Tyr Lys Lys Leu Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 

Lys Ser Pro Ala Thr Asp Leu His Leu Arg Tyr Gln Lys Asp Lys Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

Lys Tyr His Trp Glu His Thr Gly Leu Thr Leu Arg Glu Val Ser Ser
1 5 10 15

Lys Leu Arg Arg 
20 



(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His
1 5 10 15

Met Lys Val Lys His 
20 



(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 

Ser Ile Asn Leu Pro Phe Phe Glu Thr Leu Gln Glu Tyr Phe Glu Arg
1 5 10 15

Asn Arg Gln Thr Ile Ile Val Val Leu Glu Asn Val Gln Arg Lys Leu
20 25 30

Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala Ala Leu
35 40 45

Gly Lys Leu Pro Gln Gln Ala Asn Asp Tyr Leu Asn Ser Phe Asn Trp
50 55 60

Glu Arg Gln Val Ser His Ala Lys Glu Lys Leu Thr Ala Leu Thr Lys
65 70 75 80

Lys Tyr Arg Ile Thr Glu Asn Asp Ile Gln Ile Ala Leu Asp Asp Ala
85 90 95

Lys Ile Asn Phe Asn Glu Lys Leu Ser Gln Leu Gln Thr Tyr Met Ile
100 105 110

Gln Phe Asp Gln Tyr Ile Lys Asp Ser Tyr Asp Leu His Asp Leu Lys
115 120 125

Ile Ala Ile Ala Asn Ile Ile Asp Glu Ile Ile Glu Lys Leu Lys Ser
130 135 140

Leu Asp Glu His Tyr His Ile Arg Val Ile Leu Val Lys Thr Ile His
145 150 155 160

Asp Leu His Leu Phe Ile Glu Asn Ile Asp Phe Asn Lys Ser Gly Ser
165 170 175

Ser Thr Ala Ser
180



(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 94 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 



Pro Gln Gln Val Asn Asp Tyr Leu Ser Thr Phe Ser Trp Glu Arg Gln
1 5 10 15

Val Leu Ser Ala Lys Lys Lys His Ser Asp Phe Met Glu Asp Tyr Arg
20 25 30

Ile Thr Glu Asn Asp Val Arg Ile Ala Leu Asp Asn Ala Lys Ile Asn
35 40 45

Leu Asn Glu Lys Leu Thr Gln Leu Gln Thr Tyr Val Ile Gln Phe Asp
50 55 60

Gln Tyr Ile Lys Asp Asn Tyr Asp Leu His Asp Phe Lys Thr Ala Ile
65 70 75 80

Ala Arg Ile Ile Asp Glu Ile Ile Ala Thr Leu Lys Ile Leu
85 90



(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 85 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

Lys Tyr Arg Val Ala Leu Ser Arg Leu Pro Gln Gln Ile His Asp Tyr
1 5 10 15

Leu Asn Ala Ser Asp Trp Glu Arg Gln Val Ala Gly Ala Lys Glu Lys
20 25 30

Leu Thr Ser Phe Met Glu Asn Tyr Arg Ile Thr Asp Asn Asp Val Leu
35 40 45

Ile Ala Leu Asp Ser Ala Lys Ile Asn Leu Asn Glu Lys Leu Ser Gln
50 55 60

Leu Glu Thr Tyr Ala Ile Gln Phe Asp Gln Tyr Ile Arg Asp Asn Tyr
65 70 75 80

Asp Ala Gln Asp Leu
85



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 840 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Leu Asn Asp Phe Gln Val Pro Asp Leu His Ile Pro Glu Phe Gln Leu
1 5 10 15

Pro His Ile Ser His Thr Ile Glu Val Pro Thr Phe Gly Lys Leu Tyr
20 25 30

Ser Ile Leu Lys Ile Gln Ser Pro Leu Phe Thr Leu Asp Ala Asn Ala
35 40 45

Asp Ile Gly Asn Gly Thr Thr Ser Ala Asn Glu Ala Gly Ile Ala Ala
50 55 60

Ser Ile Thr Ala Lys Gly Glu Ser Lys Leu Glu Val Leu Asn Phe Asp
65 70 75 80

Phe Gln Ala Asn Ala Gln Leu Ser Asn Pro Lys Ile Asn Pro Leu Ala
85 90 95

Leu Lys Glu Ser Val Lys Phe Ser Ser Lys Tyr Leu Arg Thr Glu His
100 105 110

Gly Ser Glu Met Leu Phe Phe Gly Asn Ala Ile Glu Gly Lys Ser Asn
115 120 125

Thr Val Ala Ser Leu His Thr Glu Lys Asn Thr Leu Glu Leu Ser Asn
130 135 140

Gly Val Ile Val Lys Ile Asn Asn Gln Leu Thr Leu Asp Ser Asn Thr
145 150 155 160

Lys Tyr Phe His Lys Leu Asn Ile Pro Lys Leu Asp Phe Ser Ser Gln
165 170 175

Ala Asp Leu Arg Asn Glu Ile Lys Thr Leu Leu Lys Ala Gly His Ile
180 185 190

Ala Trp Thr Ser Ser Gly Lys Gly Ser Trp Lys Trp Ala Cys Pro Arg
195 200 205

Phe Ser Asp Glu Gly Thr His Glu Ser Gln Ile Ser Phe Thr Ile Glu
210 215 220

Gly Pro Leu Thr Ser Phe Gly Leu Ser Asn Lys Ile Asn Ser Lys His
225 230 235 240

Leu Arg Val Asn Gln Asn Leu Val Tyr Glu Ser Gly Ser Leu Asn Phe
245 250 255

Ser Lys Leu Glu Ile Gln Ser Gln Val Asp Ser Gln His Val Gly His
260 265 270

Ser Val Leu Thr Ala Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala
275 280 285

Glu Phe Thr Gly Arg His Asp Ala His Leu Asn Gly Lys Val Ile Gly
290 295 300

Thr Leu Lys Asn Ser Leu Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr
305 310 315 320

Ala Ser Thr Asn Asn Glu Gly Asn Leu Lys Val Arg Phe Pro Leu Arg
325 330 335

Leu Thr Gly Lys Ile Asp Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser
340 345 350

Pro Ser Ala Gln Gln Ala Ser Trp Gln Val Ser Ala Arg Phe Asn Gln
355 360 365

Tyr Lys Tyr Asn Gln Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met
370 375 380

Glu Ala His Val Gly Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn
385 390 395 400

Ile Pro Leu Thr Ile Pro Glu Met Arg Leu Pro Tyr Thr Ile Ile Thr
405 410 415

Thr Pro Pro Leu Lys Asp Phe Ser Leu Trp Glu Lys Thr Gly Leu Lys
420 425 430

Glu Phe Leu Lys Thr Thr Lys Gln Ser Phe Asp Leu Ser Val Lys Ala
435 440 445

Gln Tyr Lys Lys Asn Lys His Arg His Ser Ile Thr Asn Pro Leu Ala
450 455 460

Val Leu Cys Glu Phe Ile Ser Gln Ser Ile Lys Ser Phe Asp Arg His
465 470 475 480

Phe Glu Lys Asn Arg Asn Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr
485 490 495

Asn Glu Thr Lys Ile Lys Phe Asp Lys Tyr Lys Ala Glu Lys Ser Gln
500 505 510

Asp Glu Leu Pro Arg Thr Phe Gln Ile Pro Gly Tyr Thr Val Pro Val
515 520 525

Val Asn Val Glu Val Ser Pro Phe Thr Ile Glu Met Ser Ala Phe Gly
530 535 540

Tyr Val Phe Pro Lys Ala Val Ser Met Pro Ser Phe Ser Ile Leu Gly
545 550 555 560

Ser Asp Val Arg Val Pro Ser Tyr Thr Leu Ile Leu Pro Ser Leu Glu
565 570 575

Leu Pro Val Leu His Val Pro Arg Asn Leu Lys Leu Ser Leu Pro His
580 585 590

Phe Lys Glu Leu Cys Thr Ile Ser His Ile Phe Ile Pro Ala Met Gly
595 600 605

Asn Ile Thr Tyr Asp Phe Ser Phe Lys Ser Ser Val Ile Thr Leu Asn
610 615 620

Thr Asn Ala Glu Leu Phe Asn Gln Ser Asp Ile Val Ala His Leu Leu
625 630 635 640

Ser Ser Ser Ser Ser Val Ile Asp Ala Leu Gln Tyr Lys Leu Glu Gly
645 650 655

Thr Thr Arg Leu Thr Arg Lys Arg Gly Leu Lys Leu Ala Thr Ala Leu
660 665 670

Ser Leu Ser Asn Lys Phe Val Glu Gly Ser His Asn Ser Thr Val Ser
675 680 685

Leu Thr Thr Lys Asn Met Glu Val Ser Val Ala Lys Thr Thr Lys Ala
690 695 700

Glu Ile Pro Ile Leu Arg Met Asn Phe Lys Gln Glu Leu Asn Gly Asn
705 710 715 720

Thr Lys Ser Lys Pro Thr Val Ser Ser Ser Met Glu Phe Lys Tyr Asp
725 730 735

Phe Asn Ser Ser Met Leu Tyr Ser Thr Ala Lys Gly Ala Val Asp His
740 745 750

Lys Leu Ser Leu Glu Ser Leu Thr Ser Tyr Phe Ser Ile Glu Ser Ser
755 760 765

Thr Lys Gly Asp Val Lys Gly Ser Val Leu Ser Arg Glu Tyr Ser Gly
770 775 780

Thr Ile Ala Ser Glu Ala Asn Thr Tyr Leu Asn Ser Lys Ser Thr Arg
785 790 795 800

Ser Ser Val Lys Leu Gln Gly Thr Ser Lys Ile Asp Asp Ile Trp Asn
805 810 815

Leu Glu Val Lys Glu Asn Phe Ala Gly Glu Ala Thr Leu Gln Arg Ile
820 825 830

Tyr Ser Leu Trp Glu His Ser Thr
835 840

(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 774 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

Glu Phe Gln Leu Pro Arg Leu Ser His Thr Ile Glu Ile Pro Ala Phe
1 5 10 15

Gly Arg Leu His Gly Ile Leu Lys Ile Gln Ser Pro Leu Phe Ile Leu
20 25 30

Asp Ala Asn Ala Asn Ile Gln Asn Val Thr Thr Leu Glu Asn Lys Ala
35 40 45

Glu Ile Val Ala Ser Ile Ala Ala Thr Gly Glu Ser Glu Ile Glu Ala
50 55 60

Leu Asn Phe Asp Phe Gln Ala Gln Ala Gln Phe Leu Glu Leu Asn Pro
65 70 75 80

Asn Pro Leu Ile Leu Lys Glu Ser Met Asn Phe Ser Ser Lys His Ala
85 90 95

Arg Met Glu His Glu Gly Glu Ile Leu Phe Ser Gly Lys Phe Ile Glu
100 105 110

Gly Lys Leu Asp Thr Val Ala Ser Leu Gln Thr Glu Lys Asn Met Val
115 120 125

Glu Phe Asn Asn Gly Met Ile Val Lys Ile Asn Asn Pro Ile Ile Leu
130 135 140

Asp Ser His Thr Lys Tyr Phe His Lys Leu Ser Ile Pro Arg Leu Asp
145 150 155 160

Phe Ser Ser Lys Ala Ser Phe Asn Asn Glu Ile Lys Met Leu Leu Glu
165 170 175

Ala Gly His Val Ala Trp Thr Ser Ser Gly Thr Gly Ser Trp Asn Trp
180 185 190

Ala Cys Pro Asn Phe Ser Asp Glu Gly Thr His Ser Ser Lys Ile Ser
195 200 205

Phe Thr Val Glu Gly Pro Ile Ala Phe Phe Gly Leu Ser Asn Asn Ile
210 215 220

Asn Gly Lys His Leu Arg Val Ile Gln Lys Leu Ala Tyr Glu Ser Gly
225 230 235 240

Phe Leu Asn Tyr Ser Met Leu Glu Val Glu Ser Lys Val Glu Ser Gln
245 250 255

His Val Gly Ser Ser Ile Leu Thr Gly Lys Gly Thr Val Leu Leu Arg
260 265 270

Glu Ala Lys Ala Glu Met Thr Gly Glu His Asn Ala Asp Leu Asn Gly
275 280 285

Lys Val Ile Gly Thr Leu Lys Asn Ser Leu Ser Phe Ser Ala Gln Pro
290 295 300

Phe Met Ile Thr Ala Ser Thr Asn Asn Asp Gly Asn Leu Lys Val Ser
305 310 315 320

Phe Pro Leu Lys Leu Thr Gly Lys Ile Asp Phe Leu Asn Asn Tyr Ala
325 330 335

Leu Phe Leu Ser Pro His Ala Gln Gln Ala Ser Trp Gln Val Ser Ala
340 345 350

Arg Phe Asn Gln Tyr Lys Tyr Asn Gln Asn Phe Ser Ala Ile Asn Asn
355 360 365

Glu His Asn Ile Glu Ala His Val Gly Met Asn Gly Asp Ala Asn Leu
370 375 380

Asp Phe Leu Thr Ile Pro Leu Thr Ile Pro Glu Val Lys Leu Pro Tyr
385 390 395 400

Ile Gly Leu Thr Thr Pro Leu Leu Lys Asp Phe Ser Ile Trp Glu Glu
405 410 415

Thr Gly Leu Lys Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr
420 425 430

Lys Lys Asn Arg Asp Arg His Ser Ile Ala Ile Pro Leu Asn Gly Phe
435 440 445

Tyr Glu Phe Ile Leu Asn Asn Val Asp Ser Gly Ile Gly Lys Ile Gly
450 455 460

Lys Val Arg Asp Ser Ala Leu Asp Tyr Leu Ile Ser Ser Tyr Asn Glu
465 470 475 480

Ala Lys Asn Lys Phe Glu Asn Ser Leu Ile Gln Pro Ser Arg Thr Phe
485 490 495

Gln Lys Arg Gly Tyr Thr Ile Pro Phe Val Asn Ile Glu Val Thr Pro
500 505 510

Phe Thr Val Glu Thr Leu Ala Ser Ser His Val Ile Pro Lys Ala Ile
515 520 525

Asn Thr Pro Ser Val His Ile Leu Gly Pro Asn Val Ile Val Pro Ser
530 535 540

Tyr Arg Leu Val Leu Pro Ser Leu Glu Leu Pro Val Leu Arg Val Pro
545 550 555 560

Arg Asn Leu Leu Lys Phe Ser Leu Pro Asp Phe Lys Glu Leu Arg Thr
565 570 575

Ile Asp Asn Ile Tyr Ile Pro Ala Leu Gly Asn Phe Thr Tyr Asp Phe
580 585 590

Ser Phe Lys Ser Ser Val Ile Thr Leu Asn Thr Asn Val Gly Leu Tyr
595 600 605

Asn Arg Ser Asp Ile Val Ala His Phe Leu Ser Ser Ser Ser Phe Val
610 615 620

Thr Asp Ala Leu Gln Tyr Lys Leu Glu Gly Thr Ser Arg Leu Thr Arg
625 630 635 640

Lys Arg Gly Leu Lys Leu Ala Thr Ala Asp Ser Leu Thr Asn Lys Phe
645 650 655

Val Lys Gly Asn His Asp Ser Thr Phe Ser Leu Thr Lys Lys Asn Met
660 665 670

Glu Ala Ser Val Lys Thr Thr Ala Asn Leu His Ala Pro Ile Leu Thr
675 680 685

Met Asn Phe Lys Gln Glu Leu Asn Gly Asn Ala Lys Ser Lys Pro Ile
690 695 700

Val Ser Ser Ser Ile Glu Leu Asn Tyr Asp Phe Asn Ser Ser Lys Leu
705 710 715 720

Tyr Ser Thr Ala Lys Gly Gly Val Asp His Lys Phe Ser Leu Glu Ser
725 730 735

Leu Thr Ser Tyr Phe Ser Ile Glu Ser Ser Thr Lys Gly Asn Ile Lys
740 745 750

Gly Ser Val Leu Ser Gln Glu Tyr Ser Gly Ser Val Ala Ser Glu Ala
755 760 765

Asn Thr Tyr Leu Asn Ser
770



(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 785 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 

Glu Phe Gln Leu Pro His Leu Ser His Thr Ile Glu Ile Pro Ala Phe
1 5 10 15

Gly Lys Leu His Ser Ile Leu Lys Ile Gln Ser Pro Leu Phe Ile Leu
20 25 30

Asp Ala Asn Ala Asn Ile Gln Asn Val Thr Thr Ser Gly Asn Lys Ala
35 40 45

Glu Ile Val Ala Ser Val Thr Ala Lys Gly Glu Ser Gln Phe Glu Ala
50 55 60

Leu Asn Phe Asp Phe Gln Ala Gln Ala Gln Phe Leu Glu Leu Asn Pro
65 70 75 80

His Pro Pro Val Leu Lys Glu Ser Met Asn Phe Ser Ser Lys His Val
85 90 95

Arg Met Glu His Glu Gly Glu Ile Val Phe Asp Gly Lys Ala Ile Glu
100 105 110

Gly Lys Ser Asp Thr Val Ala Ser Leu His Thr Glu Lys Asn Glu Val
115 120 125

Glu Phe Asn Asn Gly Met Thr Val Lys Val Asn Asn Gln Leu Thr Leu
130 135 140

Asp Ser His Thr Lys Tyr Phe His Lys Leu Ser Val Pro Arg Leu Asp
145 150 155 160

Phe Ser Ser Lys Ala Ser Leu Asn Asn Glu Ile Lys Thr Leu Leu Glu
165 170 175

Ala Gly His Val Ala Leu Thr Ser Ser Gly Thr Gly Ser Trp Asn Trp
180 185 190

Ala Cys Pro Asn Phe Ser Asp Glu Gly Ile His Ser Ser Gln Ile Ser
195 200 205

Phe Thr Val Asp Gly Pro Ile Ala Phe Val Gly Leu Ser Asn Asn Ile
210 215 220

Asn Gly Lys His Leu Arg Val Ile Gln Lys Leu Thr Tyr Glu Ser Gly
225 230 235 240

Phe Leu Asn Tyr Ser Lys Phe Glu Val Glu Ser Lys Val Glu Ser Gln
245 250 255

His Val Gly Ser Ser Ile Leu Thr Ala Asn Gly Arg Ala Leu Leu Lys
260 265 270

Asp Ala Lys Ala Glu Met Thr Gly Glu His Asn Ala Asn Leu Asn Gly
275 280 285

Lys Val Ile Gly Thr Leu Lys Asn Ser Leu Phe Phe Ser Ala Gln Pro
290 295 300

Phe Glu Ile Thr Ala Ser Thr Asn Asn Glu Gly Asn Leu Lys Val Gly
305 310 315 320

Phe Pro Leu Lys Leu Thr Gly Lys Ile Asp Phe Leu Asn Asn Tyr Ala
325 330 335

Leu Phe Leu Ser Pro Arg Ala Gln Gln Ala Ser Trp Gln Ala Ser Thr
340 345 350

Arg Phe Asn Gln Tyr Lys Tyr Asn Gln Asn Phe Ser Ala Ile Asn Asn
355 360 365

Glu His Asn Ile Glu Ala Ser Ile Gly Met Asn Gly Asp Ala Asn Leu
370 375 380

Asp Phe Leu Asn Ile Pro Leu Thr Ile Pro Glu Ile Asn Leu Pro Tyr
385 390 395 400

Thr Glu Phe Lys Thr Pro Leu Leu Lys Asp Phe Ser Ile Trp Glu Glu
405 410 415

Thr Gly Leu Lys Glu Phe Leu Lys Thr Thr Lys Gln Ser Phe Asp Leu
420 425 430

Ser Val Lys Ala Gln Tyr Lys Lys Asn Ser Asp Lys His Ser Ile Val
435 440 445

Val Pro Leu Gly Met Phe Tyr Glu Phe Ile Leu Asn Asn Val Asn Ser
450 455 460

Trp Asp Arg Lys Phe Glu Lys Val Arg Asn Asn Ala Leu His Phe Leu
465 470 475 480

Thr Thr Ser Tyr Asn Glu Ala Lys Ile Lys Val Asp Lys Tyr Lys Thr
485 490 495

Glu Asn Ser Leu Asn Gln Pro Ser Gly Thr Phe Gln Asn His Gly Tyr
500 505 510

Thr Ile Pro Val Val Asn Ile Glu Val Ser Pro Phe Ala Val Glu Thr
515 520 525

Leu Ala Ser Arg His Val Ile Pro Thr Ala Ile Ser Thr Pro Ser Val
530 535 540

Thr Ile Pro Gly Pro Asn Ile Met Val Pro Ser Tyr Lys Leu Val Leu
545 550 555 560

Pro Pro Leu Glu Leu Pro Val Phe His Gly Pro Gly Asn Leu Phe Lys
565 570 575

Phe Phe Leu Pro Asp Phe Lys Gly Phe Asn Thr Ile Asp Asn Ile Tyr
580 585 590

Ile Pro Ala Met Gly Asn Phe Thr Tyr Asp Phe Ser Phe Lys Ser Ser
595 600 605

Val Ile Thr Leu Asn Thr Asn Ala Gly Leu Tyr Asn Gln Ser Asp Ile
610 615 620

Val Ala His Phe Leu Ser Ser Ser Ser Phe Val Thr Asp Ala Leu Gln
625 630 635 640

Tyr Lys Leu Glu Gly Thr Ser Arg Leu Met Arg Lys Arg Gly Leu Lys
645 650 655

Leu Ala Thr Ala Val Ser Leu Thr Asn Lys Phe Val Lys Gly Ser His
660 665 670

Asp Ser Thr Ile Ser Leu Thr Lys Lys Asn Met Glu Ala Ser Val Arg
675 680 685

Thr Thr Ala Asn Leu His Ala Pro Ile Phe Ser Met Asn Phe Lys Gln
690 695 700

Glu Leu Asn Gly Asn Thr Lys Ser Lys Pro Thr Val Ser Ser Ser Ile
705 710 715 720

Glu Leu Asn Tyr Asp Phe Asn Ser Ser Lys Leu His Ser Thr Ala Thr
725 730 735

Gly Gly Ile Asp His Lys Phe Ser Leu Glu Ser Leu Thr Ser Tyr Phe
740 745 750

Ser Ile Glu Ser Phe Thr Lys Gly Asn Ile Lys Ser Ser Phe Leu Ser
755 760 765

Gln Glu Tyr Ser Gly Ser Val Ala Asn Glu Ala Asn Val Tyr Leu Asn
770 775 780

Ser
785



(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1056 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

Glu Tyr Ser Gly Thr Ile Ala Ser Glu Ala Asn Thr Tyr Leu Asn Ser
1 5 10 15

Lys Ser Thr Arg Ser Ser Val Lys Leu Gln Gly Thr Ser Lys Ile Asp
20 25 30

Asp Ile Trp Asn Leu Glu Val Lys Glu Asn Phe Ala Gly Glu Ala Thr
35 40 45

Leu Gln Arg Ile Tyr Ser Leu Trp Glu His Ser Thr Lys Asn His Leu
50 55 60

Gln Leu Glu Gly Leu Phe Phe Thr Asn Gly Glu His Thr Ser Lys Ala
65 70 75 80

Thr Leu Glu Leu Ser Pro Trp Gln Met Ser Ala Leu Val Gln Val His
85 90 95

Ala Ser Gln Pro Ser Ser Phe His Asp Phe Pro Asp Leu Gly Gln Glu
100 105 110

Val Ala Leu Asn Ala Asn Thr Lys Asn Gln Lys Ile Arg Trp Lys Asn
115 120 125

Glu Val Arg Ile His Ser Gly Ser Phe Gln Ser Gln Val Glu Leu Ser
130 135 140

Asn Asp Gln Glu Lys Ala His Leu Asp Ile Ala Gly Ser Leu Glu Gly
145 150 155 160

His Leu Arg Phe Leu Lys Asn Ile Ile Leu Pro Val Tyr Asp Lys Ser
165 170 175

Leu Trp Asp Phe Leu Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg
180 185 190

Gln His Leu Arg Val Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn
195 200 205

Gly Tyr Ser Phe Ser Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile
210 215 220

Thr Pro Gly Leu Lys Leu Asn Asp Leu Asn Ser Val Leu Val Met Pro
225 230 235 240

Thr Phe His Val Pro Phe Thr Asp Leu Gln Val Pro Ser Cys Lys Leu
245 250 255

Asp Phe Arg Glu Ile Gln Ile Tyr Lys Lys Leu Arg Thr Ser Ser Phe
260 265 270

Ala Leu Asn Leu Pro Thr Leu Pro Glu Val Lys Phe Pro Glu Val Asp
275 280 285

Val Leu Thr Lys Tyr Ser Gln Pro Glu Asp Ser Leu Ile Pro Phe Phe
290 295 300

Glu Ile Thr Val Pro Glu Ser Gln Leu Thr Val Ser Arg Phe Thr Leu
305 310 315 320

Pro Lys Ser Val Ser Asp Gly Ile Ala Ala Leu Asp Leu Asn Ala Val
325 330 335

Ala Asn Lys Ile Ala Asp Phe Glu Leu Pro Thr Ile Ile Val Pro Glu
340 345 350

Gln Thr Ile Glu Ile Pro Ser Ile Lys Phe Ser Val Pro Ala Gly Ile
355 360 365

Val Ile Pro Ser Phe Gln Ala Leu Thr Ala Arg Phe Glu Val Asp Ser
370 375 380

Pro Val Tyr Asn Ala Thr Trp Ser Ala Ser Leu Lys Asn Lys Ala Asp
385 390 395 400

Tyr Val Glu Thr Val Leu Asp Ser Thr Cys Ser Ser Thr Val Gln Phe
405 410 415

Leu Glu Tyr Glu Leu Asn Val Leu Gly Thr His Lys Ile Glu Asp Gly
420 425 430

Thr Leu Ala Ser Lys Thr Lys Gly Thr Leu Ala His Arg Asp Phe Ser
435 440 445

Ala Glu Tyr Glu Glu Asp Gly Lys Phe Glu Gly Leu Gln Glu Trp Glu
450 455 460

Gly Lys Ala His Leu Asn Ile Lys Ser Pro Ala Phe Thr Asp Leu His
465 470 475 480

Leu Arg Tyr Gln Lys Asp Lys Lys Gly Ile Ser Thr Ser Ala Ala Ser
485 490 495

Pro Ala Val Gly Thr Val Gly Met Asp Met Asp Glu Asp Asp Asp Phe
500 505 510

Ser Lys Trp Asn Phe Tyr Tyr Ser Pro Gln Ser Ser Pro Asp Lys Lys
515 520 525

Leu Thr Ile Phe Lys Thr Glu Leu Arg Val Arg Glu Ser Asp Glu Glu
530 535 540

Thr Gln Ile Lys Val Asn Trp Glu Glu Glu Ala Ala Ser Gly Leu Leu
545 550 555 560

Thr Ser Leu Lys Asp Asn Val Pro Lys Ala Thr Gly Val Leu Tyr Asp
565 570 575

Tyr Val Asn Lys Tyr His Trp Glu His Thr Gly Leu Thr Leu Arg Glu
580 585 590

Val Ser Ser Lys Leu Arg Arg Asn Leu Gln Asn Asn Ala Glu Trp Val
595 600 605

Tyr Gln Gly Ala Ile Arg Gln Ile Asp Asp Ile Asp Val Arg Phe Gln
610 615 620

Lys Ala Ala Ser Gly Thr Thr Gly Thr Tyr Gln Glu Trp Lys Asp Lys
625 630 635 640

Ala Gln Asn Leu Tyr Gln Glu Leu Leu Thr Gln Glu Gly Gln Ala Ser
645 650 655

Phe Gln Gly Leu Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr
660 665 670

Gln Lys Phe His Met Lys Val Lys His Leu Ile Asp Ser Leu Ile Asp
675 680 685

Phe Leu Asn Phe Pro Arg Phe Gln Phe Pro Gly Lys Pro Gly Ile Tyr
690 695 700

Thr Arg Glu Glu Leu Cys Thr Met Phe Ile Arg Glu Val Gly Thr Val
705 710 715 720

Leu Ser Gln Val Tyr Ser Lys Val His Asn Gly Ser Glu Ile Leu Phe
725 730 735

Ser Tyr Phe Gln Asp Leu Val Ile Thr Leu Pro Phe Glu Leu Arg Lys
740 745 750

His Lys Leu Ile Asp Val Ile Ser Met Tyr Arg Glu Leu Leu Lys Asp
755 760 765

Leu Ser Lys Glu Ala Gln Glu Val Phe Lys Ala Ile Gln Ser Leu Lys
770 775 780

Thr Thr Glu Val Leu Arg Asn Leu Gln Asp Leu Leu Gln Phe Ile Phe
785 790 795 800

Gln Leu Ile Glu Asp Asn Ile Lys Gln Leu Lys Glu Met Lys Phe Thr
805 810 815

Tyr Leu Ile Asn Tyr Ile Gln Asp Glu Ile Asn Thr Ile Phe Asn Asp
820 825 830

Tyr Ile Pro Tyr Val Phe Lys Leu Leu Lys Glu Asn Leu Cys Leu Asn
835 840 845

Leu His Lys Phe Asn Glu Phe Ile Gln Asn Glu Leu Gln Glu Ala Ser
850 855 860

Gln Glu Leu Gln Gln Ile His Gln Tyr Ile Met Ala Leu Arg Glu Glu
865 870 875 880

Tyr Phe Asp Pro Ser Ile Val Gly Trp Thr Val Lys Tyr Tyr Glu Leu
885 890 895

Glu Glu Lys Ile Val Ser Leu Ile Lys Asn Leu Leu Val Ala Leu Lys
900 905 910

Asp Phe His Ser Glu Tyr Ile Val Ser Ala Ser Asn Phe Thr Ser Gln
915 920 925

Leu Ser Ser Gln Val Glu Gln Phe Leu His Arg Asn Ile Gln Glu Tyr
930 935 940

Leu Ser Ile Leu Thr Asp Pro Asp Gly Lys Gly Lys Glu Lys Ile Ala
945 950 955 960

Glu Leu Ser Ala Thr Ala Gln Glu Ile Ile Lys Ser Gln Ala Ile Ala
965 970 975

Thr Lys Lys Ile Ile Ser Asp Tyr His Gln Gln Phe Arg Tyr Lys Leu
980 985 990

Gln Asp Phe Ser Asp Gln Leu Ser Asp Tyr Tyr Glu Lys Phe Ile Ala
995 1000 1005

Glu Ser Lys Arg Leu Ile Asp Leu Ser Ile Gln Asn Tyr His Thr Phe
1010 1015 1020

Leu Ile Tyr Ile Thr Glu Leu Leu Lys Lys Leu Gln Ser Thr Thr Val
1025 1030 1035 1040

Met Asn Pro Tyr Met Lys Leu Ala Pro Gly Glu Leu Thr Ile Ile Leu
1045 1050 1055



(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 989 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

Asn Ser Lys Gly Thr Arg Ser Ser Val Arg Leu Gln Gly Ala Ser Asn
1 5 10 15

Phe Ala Gly Ile Trp Asn Phe Glu Val Gly Glu Asn Phe Ala Gly Glu
20 25 30

Ala Thr Leu Arg Arg Ile Tyr Gly Thr Trp Glu His Asn Met Ile Asn
35 40 45

His Leu Gln Val Phe Ser Tyr Phe Asp Thr Lys Gly Lys Gln Thr Cys
50 55 60

Arg Ala Thr Leu Glu Leu Ser Pro Trp Thr Met Ser Thr Leu Leu Gln
65 70 75 80

Val His Val Ser Gln Pro Ser Pro Leu Phe Asp Leu His His Phe Asp
85 90 95

Gln Glu Val Ile Leu Lys Ala Ser Thr Lys Asn Gln Lys Val Ser Trp
100 105 110

Lys Ser Glu Val Gln Val Glu Ser Gln Val Leu Gln His Asn Ala His
115 120 125

Phe Ser Asn Asp Gln Glu Glu Val Arg Leu Asp Ile Ala Gly Ser Leu
130 135 140

Glu Gly Gln Leu Trp Asp Leu Glu Asn Phe Phe Leu Pro Ala Phe Gly
145 150 155 160

Lys Ser Leu Arg Glu Leu Leu Gln Ile Asp Gly Lys Arg Gln Tyr Leu
165 170 175

Gln Ala Ser Thr Ser Leu His Tyr Thr Lys Asn Pro Asn Gly Tyr Leu
180 185 190

Leu Ser Leu Pro Val Gln Glu Leu Thr Asp Arg Phe Ile Ile Pro Gly
195 200 205

Leu Lys Leu Asn Asp Phe Ser Gly Ile Lys Ile Tyr Lys Lys Leu Ser
210 215 220

Thr Ser Pro Phe Ala Leu Asn Leu Thr Met Leu Pro Lys Val Lys Phe
225 230 235 240

Pro Gly Val Asp Leu Leu Thr Gln Tyr Ser Lys Pro Glu Gly Ser Ser
245 250 255

Val Pro Thr Phe Glu Thr Thr Ile Pro Glu Ile Gln Leu Thr Val Ser
260 265 270

Gln Phe Thr Leu Pro Lys Ser Phe Pro Val Gly Asn Thr Val Phe Asp
275 280 285

Leu Asn Lys Leu Thr Asn Leu Ile Ala Asp Val Asp Leu Pro Ser Ile
290 295 300

Thr Leu Pro Glu Gln Thr Ile Glu Ile Pro Ser Leu Glu Phe Ser Val
305 310 315 320

Pro Ala Gly Ile Phe Ile Pro Phe Phe Gly Glu Leu Thr Ala His Val
325 330 335

Gly Met Ala Ser Pro Leu Tyr Asn Val Thr Trp Ser Thr Gly Trp Lys
340 345 350

Asn Lys Ala Asp His Val Glu Thr Phe Leu Asp Ser Thr Cys Ser Ser
355 360 365

Thr Leu Gln Phe Leu Glu Tyr Ala Leu Lys Val Val Gly Thr His Arg
370 375 380

Ile Glu Asn Asp Lys Phe Ile Tyr Lys Ile Lys Gly Thr Leu Gln His
385 390 395 400

Cys Asp Phe Asn Val Lys Tyr Asn Glu Asp Gly Ile Phe Glu Gly Leu
405 410 415

Trp Asp Leu Glu Gly Glu Ala His Leu Asp Ile Thr Ser Pro Ala Leu
420 425 430

Thr Asp Phe His Leu His Tyr Lys Glu Asp Lys Thr Ser Val Ser Ala
435 440 445

Ser Ala Ala Ser Pro Ala Ile Gly Thr Val Ser Leu Asp Ala Ser Thr
450 455 460

Asp Asp Gln Ser Val Arg Leu His Val Tyr Phe Arg Pro Gln Ser Pro
465 470 475 480

Pro Asp Asn Lys Leu Ser Ile Phe Lys Met Glu Trp Arg Asp Lys Glu
485 490 495

Ser Asp Gly Glu Thr Tyr Ile Lys Ile Asn Trp Glu Glu Glu Ala Ala
500 505 510

Phe Arg Leu Leu Asp Ser Leu Lys Ser Asn Val Pro Lys Ala Ser Glu
515 520 525

Ala Val Tyr Asp Tyr Val Lys Lys Tyr His Leu Gly His Ala Ser Ser
530 535 540

Glu Leu Arg Lys Ser Leu Gln Asn Asp Ala Glu His Ala Ile Arg Met
545 550 555 560

Val Asp Glu Met Asn Val Asn Ala Gln Arg Val Thr Arg Asp Thr Tyr
565 570 575

Gln Ser Leu Tyr Lys Lys Met Leu Ala Gln Glu Ser Gln Ser Ile Pro
580 585 590

Glu Lys Leu Lys Lys Met Val Leu Gly Ser Leu Val Arg Ile Thr Gln
595 600 605

Lys Tyr His Met Ala Val Thr Trp Leu Met Asp Ser Val Ile His Phe
610 615 620

Leu Lys Phe Asn Arg Val Gln Phe Pro Gly Asn Ala Gly Thr Tyr Thr
625 630 635 640

Val Asp Glu Leu Tyr Thr Ile Ala Met Arg Glu Thr Lys Lys Leu Leu
645 650 655

Ser Gln Leu Phe Asn Gly Leu Gly His Leu Phe Ser Tyr Val Gln Asp
660 665 670

Gln Val Glu Lys Ser Arg Val Ile Asn Asp Ile Thr Phe Lys Cys Pro
675 680 685

Phe Ser Pro Thr Pro Cys Lys Leu Lys Asp Val Leu Leu Ile Phe Arg
690 695 700

Glu Asp Leu Asn Ile Leu Ser Asn Leu Gly Gln Gln Asp Ile Asn Phe
705 710 715 720

Thr Thr Ile Leu Ser Asp Phe Gln Ser Phe Leu Glu Arg Leu Leu Asp
725 730 735

Ile Ile Glu Glu Lys Ile Glu Cys Leu Lys Asn Asn Glu Ser Thr Cys
740 745 750

Val Pro Asp His Ile Asn Met Phe Phe Lys Thr His Ile Pro Phe Ala
755 760 765

Phe Lys Ser Leu Arg Glu Asn Ile Tyr Ser Val Phe Ser Glu Phe Asn
770 775 780

Asp Phe Val Gln Ser Ile Leu Gln Glu Gly Ser Tyr Lys Leu Gln Gln
785 790 795 800

Val His Gln Tyr Met Lys Ala Phe Arg Glu Glu Tyr Phe Asp Pro Ser
805 810 815

Val Val Gly Trp Thr Val Lys Tyr Tyr Glu Ile Glu Glu Lys Met Val
820 825 830

Asp Leu Ile Lys Thr Leu Leu Ala Pro Leu Arg Asp Phe Tyr Ser Glu
835 840 845

Tyr Ser Val Thr Ala Ala Asp Phe Ala Ser Lys Met Ser Thr Gln Val
850 855 860

Glu Gln Phe Val Ser Arg Asp Ile Arg Glu Tyr Leu Ser Met Leu Ala
865 870 875 880

Asp Ile Asn Gly Lys Gly Arg Glu Lys Val Ala Glu Leu Ser Ile Val
885 890 895

Val Lys Glu Arg Ile Lys Ser Trp Ser Thr Ala Val Ala Glu Ile Thr
900 905 910

Ser Asp Tyr Leu Arg Gln Leu His Ser Lys Leu Gln Asp Phe Ser Asp
915 920 925

Gln Leu Ser Gly Tyr Tyr Glu Lys Phe Val Ala Glu Ser Thr Arg Leu
930 935 940

Ile Asp Leu Ser Ile Gln Asn Tyr His Met Phe Leu Arg Tyr Ile Ala
945 950 955 960

Glu Leu Leu Lys Lys Leu Gln Val Ala Thr Ala Asn Asn Val Ser Pro
965 970 975

Tyr Leu Arg Phe Ala Gln Gly Glu Leu Ile Ile Thr Phe
980 985



(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 

Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His
1 5 10 15

Met Lys Val Lys His Leu Ile Asp Ser Leu Ile Asp Phe Leu Asn Phe
20 25 30

Pro Arg Phe Gln Phe Pro Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu
35 40 45

Leu Cys Thr Met Phe Ile Arg Glu Val Gly Thr Val Leu Ser Gln Val
50 55 60

Tyr Ser Lys Val His Asn Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln
65 70 75 80

Asp Leu Val Ile Thr Leu Pro Phe Glu Leu Arg Lys His Lys Leu Ile
85 90 95

Asp Val Ile Ser Met Tyr Arg Glu Leu Leu Lys Asp Leu Ser Lys Glu
100 105 110

Ala Gln Glu Val Phe Lys Ala Ile Gln Ser Leu Lys Thr Thr Glu Val
115 120 125

Leu Arg Asn Leu Gln Asp Leu Leu Gln Phe Ile Phe Gln Leu Ile Glu
130 135 140

Asp Asn Ile Lys Gln Leu Lys Glu Met Lys Phe Thr Tyr Leu Ile Asn
145 150 155 160

Tyr Ile Gln Asp Glu Ile Asn Thr Ile Phe Asn Asp Tyr Ile Pro Tyr
165 170 175

Val Phe Lys Leu Leu Lys Glu Asn Leu Cys Leu Asn Leu His Lys Phe
180 185 190

Asn Glu Phe Ile Gln Asn Glu Leu Gln Glu Ala Ser Gln Glu Leu Gln
195 200 205

Gln Ile His Gln Tyr Ile Met Ala Leu Arg Glu Glu Tyr Phe Asp Pro
210 215 220

Ser Ile Val Gly Trp Thr Val Lys Tyr Tyr Glu Leu Glu Glu Lys Ile
225 230 235 240

Val Ser Leu Ile Lys Asn Leu Leu Val Ala Leu Lys Asp Phe His Ser
245 250 255

Glu Tyr Ile Val Ser Ala Ser Asn Phe Thr Ser Gln Leu Ser Ser Gln
260 265 270

Val Glu Gln Phe Leu His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu
275 280 285

Thr Asp Pro Asp Gly Lys Gly Lys Glu Lys Ile Ala Glu Leu Ser Ala
290 295 300

Thr Ala Gln Glu Ile Ile Lys Ser Gln Ala Ile Ala Thr Lys Lys Ile
305 310 315 320

Ile Ser Asp Tyr His Gln Gln Phe Arg Tyr Lys Leu Gln Asp Phe Ser
325 330 335

Asp Gln Leu Ser Asp Tyr Tyr Glu Lys Phe Ile Ala Glu Ser Lys Arg
340 345 350

Leu Ile Asp Leu Ser Ile Gln Asn Tyr His Thr Phe Leu Ile Tyr Ile
355 360 365

Thr Glu Leu Leu Lys Lys Leu Gln Ser Thr Thr Val Met Asn Pro Tyr
370 375 380

Met Lys Leu Ala Pro Gly Glu Leu Thr Ile Ile Leu
385 390 395



(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 433 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Ile Pro Gly Leu Ser Glu Lys Tyr Thr Gly Glu Glu Leu Tyr Leu Met
1 5 10 15

Thr Thr Glu Lys Ala Ala Lys Thr Ala Asp Ile Cys Leu Ser Lys Leu
20 25 30

Gln Glu Tyr Phe Asp Ala Leu Ile Ala Ala Ile Ser Glu Leu Glu Val
35 40 45

Arg Val Pro Ala Ser Glu Thr Ile Leu Arg Gly Arg Asn Val Leu Asp
50 55 60

Gln Ile Lys Glu Met Leu Lys His Leu Gln Glu Lys Ile Arg Gln Thr
65 70 75 80

Phe Val Thr Leu Gln Glu Ala Asp Phe Ala Gly Lys Leu Asn Arg Leu
85 90 95

Lys Gln Val Val Gln Lys Thr Phe Gln Lys Ala Gly Asn Met Val Arg
100 105 110

Ser Leu Gln Ser Lys Asn Phe Glu Asp Ile Lys Val Gln Met Gln Gln
115 120 125

Leu Tyr Lys Asp Ala Met Ala Ser Asp Tyr Ala His Lys Leu Arg Ser
130 135 140

Leu Ala Glu Asn Val Lys Lys Tyr Ile Ser Gln Ile Lys Asn Phe Ser
145 150 155 160

Gln Lys Thr Leu Gln Lys Leu Ser Glu Asn Leu Gln Gln Leu Val Leu
165 170 175

Tyr Ile Lys Ala Leu Arg Glu Glu Tyr Phe Asp Pro Thr Thr Leu Gly
180 185 190

Trp Ser Val Lys Tyr Tyr Glu Val Glu Asp Lys Val Leu Gly Leu Leu
195 200 205

Lys Asn Leu Met Asp Thr Leu Val Ile Trp Tyr Asn Glu Tyr Ala Lys
210 215 220

Asp Leu Ser Asp Leu Val Thr Arg Leu Thr Asp Gln Val Arg Glu Leu
225 230 235 240

Val Glu Asn Tyr Arg Gln Glu Tyr Tyr Asp Leu Ile Thr Asp Val Glu
245 250 255

Gly Lys Gly Arg Gln Lys Val Met Glu Leu Ser Ser Ala Ala Gln Glu
260 265 270

Lys Ile Arg Tyr Trp Ser Ala Val Ala Lys Arg Lys Ile Asn Glu His
275 280 285

Asn Arg Gln Val Lys Ala Lys Leu Gln Glu Ile Tyr Gly Gln Leu Ser
290 295 300

Asp Ser Gln Glu Lys Leu Ile Asn Val Ala Lys Met Leu Ile Asp Leu
305 310 315 320

Thr Val Glu Lys Tyr Ser Thr Phe Met Lys Tyr Ile Phe Glu Leu Leu
325 330 335

Arg Trp Phe Glu Gln Ala Thr Ala Asp Ser Ile Lys Pro Tyr Ile Ala
340 345 350

Val Arg Glu Gly Glu Leu Arg Ile Asp Val Pro Phe Asp Trp Glu Tyr
355 360 365

Ile Asn Gln Met Pro Gln Lys Ser Arg Glu Ala Leu Arg Asn Lys Val
370 375 380

Glu Leu Thr Arg Ala Leu Ile Gln Gln Gly Val Glu Gln Gly Thr Arg
385 390 395 400

Lys Trp Glu Glu Met Gln Ala Phe Ile Asp Glu Gln Leu Ala Thr Glu
405 410 415

Gln Leu Ser Phe Gln Gln Ile Val Glu Asn Ile Gln Lys Arg Met Lys
420 425 430

Thr



(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 180 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

Asp Met Thr Phe Ser Lys Gln Asn Ala Leu Leu Arg Ser Glu Tyr Gln
1 5 10 15

Ala Asp Tyr Glu Ser Leu Arg Phe Phe Ser Leu Leu Ser Gly Ser Leu
20 25 30

Asn Ser His Gly Leu Glu Leu Asn Ala Asp Ile Leu Gly Thr Asp Lys
35 40 45

Ile Asn Ser Gly Ala His Lys Ala Thr Leu Arg Ile Gly Gln Asp Gly
50 55 60

Ile Ser Thr Ser Ala Thr Thr Asn Leu Lys Cys Ser Leu Leu Val Leu
65 70 75 80

Glu Asn Glu Leu Asn Ala Glu Leu Gly Leu Ser Gly Ala Ser Met Lys
85 90 95

Leu Thr Thr Asn Gly Arg Phe Arg Glu His Asn Ala Lys Phe Ser Leu
100 105 110

Asp Gly Lys Ala Ala Leu Thr Glu Leu Ser Leu Gly Ser Ala Tyr Gln
115 120 125

Ala Met Ile Leu Gly Val Asp Ser Lys Asn Ile Phe Asn Phe Lys Val
130 135 140

Ser Gln Glu Gly Leu Lys Leu Ser Asn Asp Met Met Gly Ser Tyr Ala
145 150 155 160

Glu Met Lys Phe Asp His Thr Asn Ser Leu Asn Ile Ala Gly Leu Ser
165 170 175

Leu Asp Phe Ser 
180 



(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 142 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

Asp Leu Thr Phe Ser Lys Gln Asn Ala Leu Leu Arg Ala Glu Tyr Gln
1 5 10 15

Ala Asp Tyr Lys Ser Leu Arg Phe Phe Thr Leu Leu Ser Gly Leu Leu
20 25 30

Asn Thr His Gly Leu Glu Leu Asn Ala Asp Ile Leu Gly Thr Asp Lys
35 40 45

Met Asn Thr Ala Ala His Lys Ala Thr Leu Arg Ile Gly Gln Asn Gly
50 55 60

Val Ser Thr Ser Ala Thr Thr Ser Leu Arg Tyr Ser Pro Leu Met Leu
65 70 75 80



Glu Asn Glu Leu Asn Ala Glu Leu Ala Leu Ser Gly Ala Ser Met Lys 
85 90 95 

Leu Ala Thr Asn Gly Arg Phe Lys Glu His Asn Ala Lys Phe Ser Leu
100 105 110

Asp Gly Lys Ala Thr Leu Thr Glu Leu Ser Leu Gly Ser Ala Tyr Gln
115 120 125

Ala Met Ile Leu Gly Ala Asp Ser Lys Asn Ile Phe Asn Phe
130 135 140 



(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 420 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

His Ile Phe Ile Pro Ala Met Gly Asn Ile Thr Tyr Asp Phe Ser Phe
1 5 10 15

Lys Ser Ser Val Ile Thr Leu Asn Thr Asn Ala Glu Leu Phe Asn Gln
20 25 30

Ser Asp Ile Val Ala His Leu Leu Ser Ser Ser Ser Ser Val Ile Asp
35 40 45


Ala Leu Gln Tyr Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys Arg
50 55 60

Gly Leu Lys Leu Ala Thr Ala Leu Ser Leu Ser Asn Lys Phe Val Glu
65 70 75 80

Gly Ser His Asn Ser Thr Val Ser Leu Thr Thr Lys Asn Met Glu Val
85 90 95

Ser Val Ala Lys Thr Thr Lys Ala Glu Ile Pro Ile Leu Arg Met Asn
100 105 110

Phe Lys Gln Glu Leu Asn Gly Asn Thr Lys Ser Lys Pro Thr Val Ser
115 120 125

Ser Ser Met Glu Phe Lys Tyr Asp Phe Asn Ser Ser Met Leu Tyr Ser
130 135 140

Thr Ala Lys Gly Ala Val Asp His Lys Leu Ser Leu Glu Ser Leu Thr
145 150 155 160

Ser Tyr Phe Ser Ile Glu Ser Ser Thr Lys Gly Asp Val Lys Gly Ser
165 170 175

Val Leu Ser Arg Glu Tyr Ser Gly Thr Ile Ala Ser Glu Ala Asn Thr
180 185 190

Tyr Leu Asn Ser Lys Ser Thr Arg Ser Ser Val Lys Leu Gln Gly Thr
195 200 205

Ser Lys Ile Asp Asp Ile Trp Asn Leu Glu Val Lys Glu Asn Phe Ala
210 215 220

Gly Glu Ala Thr Leu Gln Arg Ile Tyr Ser Leu Trp Glu His Ser Thr
225 230 235 240

Lys Asn His Leu Gln Leu Glu Gly Leu Phe Phe Thr Asn Gly Glu His
245 250 255

Thr Ser Lys Ala Thr Leu Glu Leu Ser Pro Trp Gln Met Ser Ala Leu
260 265 270

Val Gln Val His Ala Ser Gln Pro Ser Ser Phe His Asp Phe Pro Asp
275 280 285

Leu Gly Gln Glu Val Ala Leu Asn Ala Asn Thr Lys Asn Gln Lys Ile
290 295 300

Arg Trp Lys Asn Glu Val Arg Ile His Ser Gly Ser Phe Gln Ser Gln
305 310 315 320

Val Glu Leu Ser Asn Asp Gln Glu Lys Ala His Leu Asp Ile Ala Gly
325 330 335

Ser Leu Glu Gly His Leu Arg Phe Leu Lys Asn Ile Ile Leu Pro Val
340 345 350

Tyr Asp Lys Ser Leu Trp Asp Phe Leu Lys Leu Asp Val Thr Thr Ser
355 360 365

Ile Gly Arg Arg Gln His Leu Arg Val Ser Thr Ala Phe Val Tyr Thr
370 375 380

Lys Asn Pro Asn Gly Tyr Ser Phe Ser Ile Pro Val Lys Val Leu Ala
385 390 395 400

Asp Lys Phe Ile Thr Pro Gly Leu Lys Leu Asn Asp Leu Asn Ser Val
405 410 415



Leu Val Met Pro 
420 



(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 275 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Met Ala Ser Glu Lys Gly Pro Ser Asn Lys Asp Tyr Thr Leu Arg Arg
1 5 10 15

Arg Ile Glu Pro Trp Glu Phe Glu Val Phe Phe Asp Pro Gln Glu Leu
20 25 30

Arg Lys Glu Ala Cys Leu Leu Tyr Glu Ile Lys Trp Gly Ala Ser Ser
35 40 45

Lys Thr Trp Arg Ser Ser Gly Lys Asn Thr Thr Asn His Val Glu Val
50 55 60

Asn Phe Leu Glu Lys Leu Thr Arg Lys Glu Ala Cys Leu Leu Tyr Glu
65 70 75 80

Ile Lys Trp Gly Ala Ser Ser Lys Thr Trp Arg Ser Ser Gly Lys Asn
85 90 95

Thr Thr Asn His Val Glu Val Asn Phe Leu Glu Lys Leu Thr Ser Glu
100 105 110

Gly Arg Leu Gly Pro Ser Thr Cys Cys Ser Ile Thr Trp Phe Leu Ser
115 120 125

Trp Ser Pro Cys Trp Glu Cys Ser Met Ala Ile Arg Glu Phe Leu Ser
130 135 140

Gln His Pro Gly Val Thr Leu Ile Ile Phe Val Ala Arg Leu Phe Gln
145 150 155 160

His Met Asp Arg Arg Asn Arg Gln Gly Leu Lys Asp Leu Val Thr Ser
165 170 175

Gly Val Thr Val Arg Val Met Ser Val Ser Glu Tyr Cys Tyr Cys Trp
180 185 190

Glu Asn Phe Val Asn Tyr Pro Pro Gly Lys Ala Ala Gln Trp Pro Arg
195 200 205

Tyr Pro Pro Arg Trp Met Leu Met Tyr Ala Leu Glu Leu Tyr Cys Ile
210 215 220

Ile Leu Gly Leu Pro Pro Cys Leu Lys Ile Ser Arg Arg His Gln Lys
225 230 235 240

Gln Leu Thr Phe Phe Ser Leu Thr Pro Gln Tyr Cys His Tyr Lys Met
245 250 255

Ile Pro Pro Tyr Ile Leu Leu Ala Thr Gly Leu Leu Gln Pro Ser Val
260 265 270

Pro Trp Arg 
275 

(2) INFORMATION FOR SEQ ID NO: 225:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 589 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 

GGATCTGACG GTTCACTAAA CCAGCTCTGC TTATATAGAC CTCCCACCGT ACACGCCTAC 60 

CGCCCATTTG CGTCAATGGG GCGGAGTTGT TACGACATTT TGGAAAGTCC CGTTGATTTT 120 

GGTGCCAAAA CAAACTCCAT TGACGTCAAT GGGGTGGAGA CTTGGTVAATC CCCGTGAGTC 180 

AAACCGCTAT CCACGCCCAT TGATGTACTG CCAAAACCGC ATCACCATGG TAATAGCGAT 240 

GACTAATACG TAGATGTACT GCCAAGTAGG AAAGTCCCAT AAGGTCATGT ACTGGGCATA 300 

ATGCCAGGCG GGCCATTTAC CGTCATTGAC GTCAATAGGG GGCGTACTTG GCATATGATA 360 

CACTTGATGT ACTGCCAAGT GGGCAGTTTA CCGTAAATAC TCCACCCATT GACGTCAATG 420 

GAAAGTCCCT ATTGGCGTTA CTATGGGAAC ATACGTCATT ATTGACGTCA ATGGGCGGGG 480 

GTCGTTGGGC GGTCAGCCAG GCGGGCCATT TACCGTAAGT TATGTAACGC GGAACTCCAT 540 

ATATGGGCTA TGAACTAATG ACCCCGTAAT TGATTACTAT TAATAACTA 589 

(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 
GATCCAAATC ACCCACTGCA ACTCCTCCCC CTGCG 35 

(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 
GATCCATCCA ATTGGGCAAT CAGGAG 26 

(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 
GATCCGGTCT CCAATTGG 18 



(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 
GATCCTCGGG AAAGGGAAAC CGAAACTGAA GCCG 34 

CLAIMS:

1. A composition comprising:

(a) an isolated polypeptide comprising at least one LDL or VLDL nucleic 
acid binding domain; and 

(b) a nucleic acid comprising an LDL or VLDL binding sequence, 
wherein said nucleic acid is bound to said polypeptide. 

2. The composition of claim 1, wherein said polypeptide comprises an LDL nucleic acid 
binding domain. 

3. The composition of claim 1, wherein said polypeptide comprises a VLDL nucleic acid
binding domain. 

4. The composition of claim 1, wherein said nucleic acid comprises an expression region 
operably linked to a promoter active in eukaryotic cells. 

5. The composition of claim 4, wherein said expression region encodes a polypeptide. 

6. The composition of claim 4, wherein said expression region comprises an antisense 
construct. 

7. The composition of claim 5, wherein said polypeptide is selected from the group
consisting of α-globin, β-globin, γ-globin, granulocyte macrophage-colony stimulating
factor (GM-CSF), tumor necrosis factor (TNF), IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8,
IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, β-interferon, γ-interferon, cytosine
deaminase, adenosine deaminase, β-glucuronidase, hypoxanthine guanine
phosphoribosyl transferase, galactose-1-phosphate uridyltransferase, glucocerebrosidase,
glucose-6-phosphatase, thymidine kinase, lysosomal glucosidase, growth hormone,
nerve growth factor, insulin, adrenocorticotropic hormone, parathormone,
follicle-stimulating hormone, luteinizing hormone, epidermal growth factor, thyroid
stimulating hormone, CFTR, EGFR, VEGFR, IL-2 receptor, estrogen receptor, Bax, Bak,
Bcl-Xs, Bik, Bid, Bad, Harakiri, Ad E1B, an ICE-CED3 protease, neomycin resistance,
luciferase, adenine phosphoribosyl transferase (APRT), retinoblastoma, insulin, mast
cell growth factor, p53, p16, p21, MMAC1, p73, zac1 and BRCA1.

8. The composition of claim 6, wherein said antisense construct is complementary to a 
segment of an oncogene.

9. The composition of claim 8, wherein said oncogene is selected from the group
consisting of ras, myc, neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl.



10. The composition of claim 4, wherein said promoter is selected from the group consisting
of CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and
human globin γ promoter.

11. The composition of claim 1, wherein said nucleic acid binding domain is an apoB100
nucleic acid binding domain. 



12. The composition of claim 1, wherein said composition further comprises one or more
lipoproteins selected from the group consisting of apoAI, apoA-II, apoA-IV, acat, apoE,
apoC-II, apoC-III and apo-D.

13. The composition of claim 11, wherein said apoB100 is selected from the group
consisting of human, rat and baboon apoB100.

14. The composition of claim 1, wherein said polypeptide comprises at least two nucleic 
acid binding domains. 

15. The composition of claim 14, wherein said nucleic acid binding domain contains a motif
selected from the group consisting of a proline pipe helix DNA binding motif, an
ISGF3γ-like DNA binding motif, a SREBP-like DNA binding motif, a coiled-coil motif
and a nucleotide (ATP)-binding motif.

16. The composition of claim 14, wherein said binding domain is selected from the group
consisting of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, SEQ ID
NO:83, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID
NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID
NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID
NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ
ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109,
SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID NO:114,
SEQ ID NO:115, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147,
SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID
NO:152, SEQ ID NO:153, SEQ ID NO:154, SEQ ID NO:163, SEQ ID NO:164, SEQ
ID NO:165, SEQ ID NO:166 and SEQ ID NO:175.

17. The composition of claim 1, wherein said polypeptide further comprises at least one
nuclear localization sequence.

18. The composition of claim 17, wherein said nuclear localization sequence is from
apoB100.

19. The composition of claim 17, wherein said nuclear localization sequence is selected
from the group consisting of SEQ ID NO: 178, SEQ ID NO: 179, SEQ ID NO: 180, SEQ
ID NO: 194, SEQ ID NO: 195, SEQ ID NO: 196, SEQ ID NO: 197, SEQ ID NO: 198,
SEQ ID NO: 199, SEQ ID NO: 200, SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO:
203, SEQ ID NO: 204, SEQ ID NO: 205, SEQ ID NO: 206, SEQ ID NO: 207, SEQ ID
NO: 208, SEQ ID NO: 209 and SEQ ID NO: 210.

20. A method for expressing a polypeptide in a human cell comprising:

(a) providing a composition comprising (i) an isolated polypeptide comprising at
least one LDL or VLDL nucleic acid binding domain and (ii) a nucleic acid 
comprising an expression cassette comprising a sequence encoding said 
polypeptide and a promoter active in eukaryotic cells, wherein said coding 
sequence is operably linked to said promoter, and wherein said nucleic acid 
sequence is bound to said LDL or VLDL; 

b) contacting said composition with said cell under conditions permitting transfer 
of said composition into said cell; and 

c) culturing said cell under conditions permitting the expression of said 
polypeptide. 

21. The method of claim 20, wherein said polypeptide is a tumor suppressor.

22. The method of claim 20, wherein said polypeptide is a cytokine. 

23. The method of claim 20, wherein said polypeptide is an enzyme. 

24. The method of claim 20, wherein said polypeptide is a hormone. 

25. The method of claim 20, wherein said polypeptide is a receptor. 

26. The method of claim 20, wherein said polypeptide is an inducer of apoptosis. 

27. The method of claim 21, wherein said tumor suppressor is selected from the group 
consisting of p53, p16, p21, MMAC1, p73, zac1, BRCA1 and Rb.

28. The method of claim 22, wherein said cytokine is selected from the group consisting of 
IL-2, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, TNF, GMCSF, β-interferon and γ-interferon.

29. The method of claim 23, wherein said enzyme is selected from the group consisting of
cytosine deaminase, adenosine deaminase, β-glucuronidase, hypoxanthine guanine
phosphoribosyl transferase, galactose-1-phosphate uridyltransferase, glucocerebrosidase,
glucose-6-phosphatase, thymidine kinase and lysosomal glucosidase. 

30. The method of claim 24, wherein said hormone is selected from the group consisting of 
growth hormone, nerve growth factor, insulin, adrenocorticotropic hormone, 
parathormone, follicle-stimulating hormone, luteinizing hormone, epidermal growth 
factor and thyroid stimulating hormone. 

31. The method of claim 25, wherein said receptor is selected from the group consisting of 
CFTR, EGFR, VEGFR, IL-2 receptor and the estrogen receptor. 

32. The method of claim 26, wherein said inducer of apoptosis is selected from the group 
consisting of Bax, Bak, Bcl-Xs, Bik, Bid, Bad, Harakiri, Ad E1B and an ICE-CED3
protease. 

33. The method of claim 20, wherein said promoter is selected from the group consisting of 
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human
globin γ promoter.

34. The method of claim 20, wherein said nucleic acid binding domain is an apoB100
nucleic acid binding domain. 

35. The method of claim 20, wherein said apoB100 is selected from the group consisting of
human, rat and baboon low density apoB100.

36. The method of claim 27, wherein said binding region is selected from the group 
consisting of a proline pipe helix DNA binding motif, an ISGF3γ-like DNA binding
motif, a SREBP-like DNA binding motif, a coiled-coil motif, and a nucleotide
(ATP)-binding motif.

37. The method of claim 20, wherein said polypeptide further comprises at least one nuclear
localization sequence. 

38. The method of claim 37, wherein said nuclear localization sequence is an apoB100
nuclear localization sequence. 

39. The method of claim 20, wherein said polypeptide is selected from the group consisting 
of α-globin, β-globin, γ-globin, neomycin resistance, luciferase, adenine phosphoribosyl
transferase (APRT) and mast cell growth factor.

40. A method for providing an expression construct to a human cell comprising: 

(a) providing a composition comprising (i) an isolated polypeptide comprising at 
least one LDL or VLDL nucleic acid binding domain and (ii) an expression 
cassette comprising a nucleic acid sequence encoding an expression region and a 
promoter active in eukaryotic cells, wherein said expression region is operably 
linked to said promoter, and wherein said nucleic acid sequence is bound to said 
LDL or VLDL; 

b) contacting said composition with said cell under conditions permitting transfer
of said composition into said cell; and 

c) culturing said cell under conditions permitting the expression of said expression 
region. 

41. The method of claim 40, wherein said expression construct comprises an antisense 
construct. 

42. The method of claim 40, wherein said antisense construct is derived from an oncogene. 

43. The method of claim 42, wherein said oncogene is selected from the group consisting
of ras, myc, neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl.

44. The method of claim 40, wherein said expression construct comprises a nucleic acid
coding for a gene. 

45. The method of claim 44, wherein said gene encodes a polypeptide. 

46. The method of claim 40, wherein said promoter is selected from the group consisting of 
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human
globin γ promoter.

47. The method of claim 40, wherein said nucleic acid binding domain is an apoB100
nucleic acid binding domain. 

48. The method of claim 47, wherein said apoB100 is selected from the group consisting of
human, rat and baboon low density apoB100.

49. The method of claim 48, wherein said DNA binding region is selected from the group 
consisting of a proline pipe helix DNA binding motif, an ISGF3γ-like DNA binding
motif, a SREBP-like DNA binding motif, a coiled-coil motif, and a nucleotide
(ATP)-binding motif.

50. The method of claim 40, wherein said polypeptide further comprises at least one nuclear 
localization sequence. 

51. The method of claim 50, wherein said nuclear localization sequence is an apoB100
nuclear localization sequence. 

52. The method of claim 40, wherein said gene encodes a polypeptide selected from the 
group consisting of α-globin, β-globin, γ-globin, green fluorescent protein, neomycin
resistance, luciferase, adenine phosphoribosyl transferase (APRT) and mast cell growth
factor.

53. A method for treating a human disease comprising:

a) providing a composition comprising (i) an isolated polypeptide comprising at
least one LDL or VLDL nucleic acid binding domain and (ii) an expression
cassette comprising a nucleic acid sequence encoding an expression region and a
promoter active in eukaryotic cells, wherein said expression region is operably
linked to said promoter, and wherein said nucleic acid sequence is bound to said
LDL or VLDL; and

b) administering said composition to a human subject having said disease under
conditions permitting transfer of said composition into cells of said human
subject.

54. The method of claim 53, wherein said disease is selected from the group consisting of 
cancer, diabetes, cystic fibrosis and arteriosclerosis. 

55. The method of claim 53, wherein said promoter is selected from the group consisting of
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human
globin γ promoter.

56. The method of claim 53, wherein said nucleic acid binding domain is an apoB100
binding domain.

57. The method of claim 56, wherein said apoB100 is selected from the group consisting of
human, rat and baboon low density lipoprotein apoB100.

58. The method of claim 53, wherein said polypeptide comprises at least two nucleic acid
binding regions. 

59. The method of claim 58, wherein said binding region is selected from the group
consisting of a proline pipe helix DNA binding motif, an ISGF3γ-like DNA binding
motif, a SREBP-like DNA binding motif, a coiled-coil motif, and a nucleotide
(ATP)-binding motif.



WO 98/56938    PCT/US98/11927

60. The method of claim 53, wherein said polypeptide comprises at least one nuclear 
localization sequence. 

61. The method of claim 60, wherein said nuclear localization sequence is an apoB100
nuclear localization sequence. 

62. The method of claim 53, wherein said nucleic acid encodes a gene. 

63. The method of claim 53, wherein said expression construct comprises an antisense
construct. 

64. A pharmaceutical composition comprising: 

(a) an isolated polypeptide comprising at least one LDL or VLDL nucleic acid 
binding domain; and 

(b) a nucleic acid comprising an LDL or VLDL binding sequence, wherein said 
nucleic acid is bound to said polypeptide; 

said pharmaceutical composition being dispersed in a suitable diluent. 

65. A method of transforming a cell comprising: 

a) providing a cell; 

b) contacting said cell with a composition comprising (i) an isolated polypeptide 
comprising at least one LDL or VLDL nucleic acid binding domain and (ii) an 
expression cassette comprising a nucleic acid sequence encoding an expression region 
and a promoter active in eukaryotic cells, wherein said expression region is operably
linked to said promoter, and wherein said nucleic acid sequence is bound to said LDL or 
VLDL; 

wherein expression of said expression region is indicative of said transformation. 

66. A method of transfecting a cell comprising the steps of: 
a) providing a cell; 
b) contacting said cell with a composition comprising (i) an isolated polypeptide
comprising at least one LDL or VLDL nucleic acid binding domain and (ii) an
expression cassette comprising a nucleic acid sequence encoding an expression region
and a promoter active in eukaryotic cells, wherein said expression region is operably
linked to said promoter, and wherein said nucleic acid sequence is bound to said LDL or
VLDL; and

wherein expression of said expression region is indicative of said transfection. 

Identification of the regions of apo B-100 and the
proteins compared in Figures 2A-2D.

Reference Protein Name*                             SEQ ID NO

Apo B-100 region B1 (aa 24-69)                      SEQ ID NO: 3
r9 (aa 66-114). Cell division control protein 25    SEQ ID NO: 4
  gim|4857
Apo B-100 region B2 (aa 75-119)                     SEQ ID NO: 5
r33 (aa 69-114). Abl proto-oncogene tyrosine        SEQ ID NO: 6
  kinase (P150) gim|13887
Apo B-100 region B3-1 (aa 240-283)                  SEQ ID NO: 7
r35 (aa 799-841).                                   SEQ ID NO: 8
  1-Phosphatidylinositol-4,5-bisphosphate
  phosphodiesterase gamma (PLC-gamma.
Apo B-100 region [illegible] (aa [illegible])       SEQ ID NO: 9
r18 (aa 69-114). Lck proto-oncogene                 SEQ ID NO: 10
Apo B-100 region B4 (aa [illegible])                [illegible]
r52 (aa 57-109). BLK protein tyrosine kinase        SEQ ID NO: 12
  (B lymphocyte kinase) (P55-BLK) gim|13991
Apo B-100 region B5 (aa 652-700)                    SEQ ID NO: 13
r34 (aa 984-1031). Myosin IC heavy chain            SEQ ID NO: 14
  gim|16466
Apo B-100 region B6 (aa 711-756)                    SEQ ID NO: 15
r25 (aa 12-61). Phosphatidylinositol 3-OH           SEQ ID NO: 16
  gim|18072

FIG. 2E


Identification of the regions of apo B-100 and the
proteins compared in Figures 2A-2D.

Apo B-100 region B9-1 (aa 2497-2547)                SEQ ID NO: 19
r35-2 (aa 800-850).                                 SEQ ID NO: 20
  1-Phosphatidylinositol-4,5-bisphosphate
  phosphodiesterase gamma (PLC-gamma, PLC-II)
  gim|18895
Apo B-100 region B9-2 (aa 2497-2551)                SEQ ID NO: 21
r43 (aa 444-496). Nuclear fusion protein FUS1       SEQ ID NO: 22
  gim|9498
r49 (aa 86-134). Fgr proto-oncogene tyrosine        SEQ ID NO: 23
  gim|14097
Apo B-100 region B10 (aa 3311-3355)                 SEQ ID NO: 24
r9-2 (aa 66-114). Cell division control protein 25  SEQ ID NO: 25
  gim|4857
Apo B-100 region B11 (aa [illegible])               SEQ ID NO: 26
[illegible] Neutrophil Cytosol Factor 1             SEQ ID NO: 27
  (NCF-47K) gim|16659
Apo B-100 region B12 (aa [illegible])               SEQ ID NO: 28
r3 (aa [illegible]). Bem-1 protein gim|3905         SEQ ID NO: 29
Apo B-100 region B13 (aa 4053-4099)                 [illegible]
r3-2 (aa 163-214). Bem-1 protein gim|3905           SEQ ID NO: 31
Apo B-100 region B14 (aa 4180-4222)                 SEQ ID NO: 32
r36 (aa 248-299). Neutrophil NADPH oxidase          SEQ ID NO: 33
  factor (P67-PHOX) gim|16660
Apo B-100 region B15 (aa 4179-422)                  SEQ ID NO: 34
r59. Cytoplasmic protein gim 16669                  SEQ ID NO: 35